  1. Jul 2023
    1. eLife assessment

      This study is a valuable contribution to our understanding of vocal variation in acoustic displays of male baleen whales, part of a developing story about cultural change in songs in species other than the relatively well studied humpback whales. The authors present solid evidence of changes at various timescales in 20-Hz song note intervals and call center frequency over decadal time scales and large spatial scales.

    2. Reviewer #1 (Public Review):

      Romagosa, Nieukirk et al. present an interesting approach and interpretation to what is assumed to be a learned animal behavior. In this case, the observed behavior is fin whale (Balaenoptera physalus) singing and the analyses provide results indicating spatio-temporal variation in three fin whale song features at distinct locations within the Central and Northeast North Atlantic Ocean (ONA) region within a two-decade time period. The data set is a non-standardized collection of acoustic recordings obtained from multiple research scientists. Most of the acoustic recording samples are very sparse, with the majority of data coming from an area around the Azores and collected by Okeanos scientists. The senior author undertook the enormously demanding task of analyzing the acoustic data using non-automatic, standardized techniques and protocols. Songs from individual periods of singing on any given day were selected for analysis based on song quality. Song measurements included interval of time between successive 20-Hz song notes (INI), peak frequencies of those 20-Hz notes and peak frequencies of higher frequency notes (HF note). The resultant units of analysis are daily measures of INI (average and s.d.), 20-Hz note peak frequencies (average and s.d.), and HF note peak frequencies (average and s.d.). Several of the figures are confused by not representing the time axis in a typical, uniformly linear way (Fig. 2A and Fig. 3). This form of dynamic time warping smooths and distorts the time-varying features of the results and obscures the inherent sparseness of and high variability in the durations and locations of recordings in the available data set. This fundamental characteristic of the available data (see Fig. S1), which represents a form of sample aliasing, is not adequately addressed in the paper in terms of how it influences or restricts interpretation of the results.
Another possible over-interpretation of results involves misrepresentation of the actual areas sampled. For example, data were collected on Dec 2007-Feb 2008 and Oct 2015 March from a recorder location off the southwest of the Iberian Peninsula. The acoustic sampling detection space is restricted to the ocean within some tens of kilometers of a single sensor, a very small dot on the maps in the manuscript, yet the data from this recorder are assigned to the relatively very large region referred to as the "Bay of Biscay & Iberian Coast". Within the two-decade period of the study (ca. 120 months), recordings were collected at this site (E in Figure 1) for 9 months (7.5%), and the two sampling periods occurred within the December 2007 through March 2018 time span (see Fig S1). It is scientifically inappropriate to translate this as data representing the Bay of Biscay & Iberian Coast as this kind of misrepresentation can lead to misinterpretation of the results.

      Despite these spatial and temporal sampling issues, the analyses reveal several important features (Fig. 2 and Fig. 3) about fin whale song in the ONA. The import of the analytical results is that the time span and spatial scale over which recordings were collected provide a unique opportunity to observe whether or not there were variations in fin whale song features within a large ocean region, across a span of two decades. One can consider these spatial and temporal scales appropriately matched to the known scales of fin whale natural history and ecology. Thus, the study results, although confronted by some sampling issues, are not biased by inappropriately sized spatial and temporal scales.

      This MS joins a small but growing list of papers documenting variability in baleen whale acoustic behaviors over ecologically appropriate spatial and temporal scales. These papers are primarily focused on singing, an acoustically obvious male reproductive display. As with several recent papers, the author takes advantage of a growing body of data collected during previous studies. The actual measurements utilized several established acoustic analysis software tools. The interpretation of the results focuses on evidence of vocal learning in fin whale singers (i.e. males performing reproductive displays) and wisely remains tangential to interpreting fin whale song through a cultural lens.

    3. Reviewer #2 (Public Review):

      This research brings together an impressively long timescale dataset of fin whale song vocalisations in the North Atlantic, measuring the note frequency content and inter-note intervals and thereby tracking shifts in both over time. Different time periods are covered in different regions of the North Atlantic during the course of the study. There are two principal results - the study documents a shift in the inter-note interval (INI) in an ICES eco-region termed 'Oceanic Northeast Atlantic' (although the relevance of this to fin whale populations is unclear) occurring relatively rapidly in the years 2000-2001. This shift is discontinuous and appears to show an abrupt change in note intervals in most (though not all) of the songs recorded. The second key result is that this INI measure and also the peak frequency of a song element termed the 'HF note' both show consistent directional change over timescales of 12 years. The INI measure begins to change back toward the value it held prior to the 2000/2001 shift, suggestive of a cyclical process of change coupled with resets. The average HF note peak frequency descended by about 5 Hz during the study period but there was no evidence of abrupt shifts.

      The research significance is largely in the description of these processes in a new area, similar changes in rorqual song have been examined in the Southern Ocean and Mediterranean, and the argued interpretation of these changes as evidence for cultural learning processes in song change - the debate over whether these changes have environmental causation or are due to learning processes similar to song change in humpbacks is ongoing and this study therefore contributes interesting evidence from a newly covered population.

      I think the methods and analyses broadly support the claims but also that there are weaknesses in interpretation and presentation that should be addressed. I think perhaps the degree to which this is evidence of vocal learning may be a bit overplayed. Definitely there is change, but it is tricky to compare this to e.g. experimental demonstrations. For example, age-related changes in a changing post-whaling demographic scenario should at least be considered? Is there also any possibility for large-scale oceanographic variations to be included in some way - temperature shifts, for example? This could help understand the different roles of environment and learning in these processes. I think it is also important that these results be placed in a more detailed context of current knowledge of fin whale population structure in the north Atlantic - could population range shifts be a factor? The INI data show an interesting variation in the recordings from the Barents Sea and this could be discussed in the light of population structure knowledge also. It is unclear from the presentation whether the INI shift in 2000/2001 was coupled with any frequency shifts - if not, it suggests different trajectories and processes affecting these two aspects of the acoustic display.

      I am not convinced the main story here is about conformity, and I think it would be a mistake to too easily reach for the humpback comparison but there are certainly questions to be asked about the 2000/2001 shift in terms of the processes that led to it.

    4. Reviewer #3 (Public Review):

      The authors used passive acoustic monitoring over a vast range of the North Atlantic to study the call rates of fin whales. They found a 'take over' of a new rhythm (inter call intervals) during their study period. This was interpreted as a change in song production.

      I am not completely convinced the authors are correct in describing this change in rate as a change in the song. Even though fin whale calls are evidently a male mating ground display, little is known about its function. Compared to humpback whales with their impressive repertoire of vocalizations, repeating themselves on the breeding grounds after some tens of minutes and therefore qualifying as a very slow 'song' similar to bird song, fin whales only emit a single type of call, which remains the same throughout the study period. It can be contested, I would assume, that merely a change in the repetition rate of calls, even though seemingly done here in an 'overtake' fashion, can qualify as a change and learning of song.

    1. Author Response

      The following is the authors’ response to the current reviews.

      We will make some minor changes to address the issues in the revised manuscript during preparation of the Version of Record.

      1) Acknowledge the previous discovery that COUPTFII expression is confined to the ventral hippocampus in early human fetal forebrain (doi: 10.1093/cercor/bhx185).

      We agree. We will incorporate the previous discovery that COUPTFII expression is confined to the ventral hippocampus in early human fetal forebrain (doi: 10.1093/cercor/bhx185) in the discussion section of "COUP-TFII governs the distinct characteristics of the ventral hippocampus".

      2) Give some consideration to this observation from my original review "Abnormalities in the trisynaptic circuit. No studies of actual synapses, either physiological or morphological, were carried out. I wonder to what extent these immunohistochemical studies just further reflect the abnormalities in hippocampal morphology presented earlier in the manuscript without specifically telling us about synaptic circuits? Although the immunohistochemical preparations are beautiful, they are inadequate on their own in telling us much about what sort of synaptic circuitry exists in the transgenic animals".

      Our data in Figure 4 show clearly that at the neural circuit level, compared with the corresponding control, the trisynaptic circuit is abnormal in all three models; therefore, in the discussion section of "COUP-TF genes are imperative for the formation of the trisynaptic circuit", we will add the following sentence, "We would like to investigate what sort of synaptic circuitry is compromised either physiologically or morphologically in the trisynaptic circuit of individual animal model in detail in the future studies."

      In addition, we will correct a reference related to the COUP-TFII gene and congenital heart defects.

      The reference of "High, F. A., Bhayani, P., Wilson, J. M., Bult, C. J., Donahoe, P. K., & Longoni, M. (2016). De novo frameshift mutation in COUP-TFII (NR2F2) in human congenital diaphragmatic hernia. Am J Med Genet A, 170(9), 2457-2461. doi:10.1002/ajmg.a.37830" was replaced with "Al Turki, S., Manickaraj, A. K., Mercer, C. L., Gerety, S. S., Hitz, M. P., Lindsay, S., . . . Hurles, M. E. (2014). Rare variants in NR2F2 cause congenital heart defects in humans. Am J Hum Genet, 94(4), 574-585. doi:10.1016/j.ajhg.2014.03.007".

      —————

      The following is the authors’ response to the original reviews.

      Reviewer #1(Recommendations For The Authors):

      1) Better presentation of the western blot results

      We agree with the reviewer. Based on the suggestion, new information about the western blot results has been added in the revised Figure 1Ap. We added a dash to each western blot image to indicate the target band of COUP-TFI (46 KDa), COUP-TFII (45 KDa), and GAPDH (37 KDa), respectively. There were two bands in the blot of COUP-TFII, with the upper band corresponding to mouse IgG at 50 KDa, and the bottom band corresponding to COUP-TFII protein at 45 KDa. Therefore, only the lower bands of COUP-TFII are used for the quantitative analysis. The expression of COUP-TFII in the ventral hippocampus is clearly higher than that in the dorsal hippocampus.

      2) Full presentation of the immunohistochemistry and qPCR results at E11.5 and E14.5 in double knockdown mice.

      Thanks for the suggestion. Based on the suggestion, we added immunofluorescent data in the double knockout mice at E11.5 in the Figure 5Ba-h. Meanwhile, given that it takes time to prepare animal samples at E14.5 for RT-qPCR assays, we performed immunofluorescent assays at both E13.5 and E14.5 to make sure that the changes of Lhx5 and Lhx2 expression in the hippocampal regions between the control and mutant mice were consistent. As shown in the new Figure 5B, consistent with the downregulated expression of Lhx5 transcripts in the double mutant, the expression of the Lhx5 protein was reduced in the CH in the double mutants at E11.5; moreover, the numbers of Lhx5-positive Cajal-Retzius cells decreased in the double mutant embryos at E11.5, E13.5 and E14.5 (Figure 5Ba-d, a’-d’, a’’-d’’, i-l, i’-l’, q-t, q’-t’). Consistent with RT-qPCR data, the expression of Lhx2 was comparable between the control and double-mutant mice at E11.5 (Figure 5Be-h, e’-h’). Interestingly, the expression of the Lhx2 protein was increased in the hippocampal primordium in the COUP-TF double-mutant mice at E13.5 and E14.5 (Figure 5Bm-p, m’-p’, u-x, u’-x’). Please find the altered descriptions in the Page 15, lines 347-351, 353-358 and Page 21, lines 500-503 in the revised manuscript.

      3) Minor corrections. Lines 159-162: "prospected" is not quite the right word. I would suggest "an ectopic CA-like region was observed medially in the temporal hippocampus in the COUP-TFII mutant, where the prospective posterior part of the medial amygdaloid nucleus was situated, (MeP), indicated by the star (Figure 1Ba-f). The presence of the ectopic CA-like region in the ventral but not dorsal hippocampus of the mutant was further confirmed by the presence of the prospective MeP and amygdalohippocampal area (AHi) in sagittal sections, as indicated by the star." See also line 251. Lines 437/438: I would suggest "... most important breakthroughs in understanding the role of the hippocampus in memory."

      Thanks for the suggestion. We made the changes based on the suggestion. Please find the amendments in Page 8, lines 178-181; Page 12, lines 270, 276; Page 14, line 318; Page 19, lines 451; Page 20, lines 461-462 in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      1) It is also important to point out that the immunofluorescence data in Figure 5B is contrary to what is known for Lhx5 (it's not expressed in the neocortical and hippocampal vz) and Lhx2 (it's not expressed in the choroid plexus). Authors should explain how their conclusions could align more clearly, and consider the possibility that their results are due to a possible artifact of image setting issues or worse, antibody specificity issues.

      Very good point. Based on the comments and suggestions, we first tested another Lhx5 antibody, R&D, Cat # AF6290, in the immunofluorescence assays. Indeed, there was something wrong with the previous Lhx5 antibody, Millipore, Cat # AB5762. With the new Lhx5 antibody, consistent with the reported in situ data, the expression of Lhx5 was detected specifically in the CH at E11.5, and in the Cajal-Retzius cells in the marginal zone of the telencephalon. The same Lhx2 antibody, Santa Cruz, Cat # sc-19344, which has been used successfully in one of our previous studies (Tang et al., Development, 2012) (PMID: 22492355), was used in the present study. We believe that the observations at the MP and DP of the samples are really associated with the expression of Lhx2 protein. We performed new immunofluorescence assays with the new Lhx5 antibody and confirmed with the Lhx2 antibody. As shown in new Figure 5B, consistent with the downregulated expression of Lhx5 transcripts in the double mutant, the expression of the Lhx5 protein was reduced in the CH in the double mutants at E11.5; moreover, the numbers of Lhx5-positive Cajal-Retzius cells decreased in the double mutant embryos at E11.5, E13.5 and E14.5 (Figure 5Ba-d, a’-d’, a’’-d’’, i-l, i’-l’, q-t, q’-t’). Consistent with RT-qPCR data, the expression of Lhx2 was comparable between the control and double-mutant mice at E11.5 (Figure 5Be-h, e’-h’). Interestingly, the expression of the Lhx2 protein was increased in the hippocampal primordium in the COUP-TF double-mutant mice at E13.5 and E14.5 (Figure 5Bm-p, m’-p’, u-x, u’-x’). Please find the changed descriptions in Page 15, lines 347-351, 353-358 and Page 21, lines 500-503 in the revised manuscript.

      The reference:

      Tang, K., Rubenstein, J. L., Tsai, S. Y., & Tsai, M. J. (2012). COUP-TFII controls amygdala patterning by regulating neuropilin expression. Development, 139(9), 1630-1639. doi:10.1242/dev.075564

      2) The expression domain of RxCre remains poorly explained, and the early expression of COUPTFI and II (E10.5-E12.5) could be considered major weaknesses of the paper.

      Thanks for the suggestion. The generation of RXCre was reported by Swindell et al., Genesis, 2006 (PMID: 16850473). Given that the activation of the LacZ expression serves as an indicator for the deletion of the COUP-TFII gene (Tang et al., Development, 2012) (PMID: 22492355), we performed immunofluorescence assays with antibodies against COUP-TFII and LacZ on the sagittal sections of RXCre/+; COUP-TFIIF/+ heterozygous mutant and RXCre/+; COUP-TFIIF/F homozygous mutant mice at E11.5. As shown in the new Figure 1—figure supplement 1Da-f, COUP-TFII was readily detected at the hippocampal primordium of the heterozygous mutant embryo at E11.5 (Figure 1—figure supplement 1Da, c, g); in contrast, the expression of COUP-TFII significantly decreased in the homozygous mutant (Figure 1—figure supplement 1Dd, f, j). In addition, compared with the heterozygous mutant embryo, the LacZ signals increased distinctly in the hippocampal primordium of the homozygous mutant embryo at E11.5 (Figure 1—figure supplement 1Db-c, e-f, h, k), suggesting that RXCre recombinase can efficiently excise the COUP-TFII gene in the hippocampal primordium as early as E11.5. Please find the corresponding changes in Page 7, lines 149-159 and Page 8, lines 160-164 in the revised manuscript.

      Meanwhile, we also added the early expression of COUP-TFI and -TFII at E10.5 and E11.5 in new Figure 1—figure supplement 1Aa-d. At embryonic days 10.5 (E10.5), COUP-TFI was detected in the dorsal pallium (DP) laterally and COUP-TFII was expressed in the MP and CH medially (Figure 1—figure supplement 1Aa, b). At E11.5, the expression of COUP-TFII remained in the hippocampal primordium, including MP and CH (Figure 1—figure supplement 1Ac, d). Please find the corresponding changes in Page 6, lines 129-132 and Page 9, lines 202-203 in the revised manuscript.

      The references:

      Swindell, E. C., Bailey, T. J., Loosli, F., Liu, C., Amaya-Manzanares, F., Mahon, K. A., . . . Jamrich, M. (2006). Rx-Cre, a tool for inactivation of gene expression in the developing retina. Genesis, 44(8), 361-363. doi:10.1002/dvg.20225

      Tang, K., Rubenstein, J. L., Tsai, S. Y., & Tsai, M. J. (2012). COUP-TFII controls amygdala patterning by regulating neuropilin expression. Development, 139(9), 1630-1639. doi:10.1242/dev.075564

      Reviewer #3 (Recommendations For The Authors):

      1) Regarding the RxCre line, I was also confused about its spatiotemporal expression, as this line is not a commonly used Cre line and no detailed description is provided in the manuscript. Searching this line shows a previous paper by the authors (PMID: 22492355) in which they tested the RxCre recombinase activity. At E12.5, RxCre induced high LacZ expression in the ventral telencephalon but much less in the dorsal telencephalon. But they did not check later stage. Therefore, it's hard to explain the defective dorsal hippocampus in RxCre, CFI CKO. They should check later stage.

      The generation of RXCre was reported by Swindell et al., Genesis, 2006 (PMID: 16850473), which reveals high Cre recombinase activity of RXCre in the eye and ventral telencephalon. Given that the activation of the LacZ expression serves as an indicator for the deletion of the COUP-TFII gene (Tang et al., Development, 2012) (PMID: 22492355), we performed immunofluorescence assays with antibodies against COUP-TFII and LacZ on the sagittal sections of RXCre/+; COUP-TFIIF/+ heterozygous mutant and RXCre/+; COUP-TFIIF/F homozygous mutant mice at E11.5. As shown in new Figure 1—figure supplement 1D, compared with the heterozygous mutant embryo, the expression of COUP-TFII was significantly decreased in the homozygous mutant; in addition, the LacZ signals evidently increased in the hippocampal primordium of the homozygous mutant embryo at E11.5, suggesting that RXCre recombinase can efficiently excise the target gene in the hippocampal primordium as early as E11.5. The expression of COUP-TFI is barely detectable in the early developing hippocampal primordium including MP at E10.5, E11.5 and E12.5. The expression of COUP-TFI is high in the MP of the control (Figure 1Cj, l); in contrast, the COUP-TFI expression is barely detectable in the MP of the homozygous double mutant at E14.5, indicating that RXCre can efficiently delete the COUP-TFI gene in the hippocampal primordium at E14.5. The loss of the COUP-TFI gene in the MP as early as E14.5 by RXCre initiates the defective dorsal hippocampus in RXCre/+; COUP-TFIF/F knockout mice.

      2) Authors should check and review extensively for improvements to the use of English.

      We carefully checked and made changes throughout the manuscript accordingly. For example, “imperative” was used 6 times in the previous manuscript, lines 20, 255, 486, 499, 522, 553; “imperative” was used only once in Page 22, line 522 in the revised manuscript.

      3) Please correct the manuscript; 1-month-old mice are not adult mice.

      Thanks for the suggestion. Based on the suggestion, we have corrected related words and sentences in the manuscript. Please find the amendments in the revised manuscript (Page 7, line 146; Page 9, lines 203-204; Page 10, line 213; Page 13, lines 299-300; Page 17, line 406; Page 20, line 476).

      4) Additional ref should be added at line 93 on page 5.

      Based on the suggestion, we added some new references (Bertacchi et al., EMBO J, 2020) (PMID: 32572460); (Del Pino et al., Cereb Cortex, 2020) (PMID: 32484994); (J. Feng et al., Sci Adv, 2021) (PMID: 34215582) at line 96 on page 5.

      The references:

      Bertacchi, M., Romano, A. L., Loubat, A., Tran Mau-Them, F., Willems, M., Faivre, L., . . . Studer, M. (2020). NR2F1 regulates regional progenitor dynamics in the mouse neocortex and cortical gyrification in BBSOAS patients. Embo j, 39(13), e104163. doi:10.15252/embj.2019104163

      Del Pino, I., Tocco, C., Magrinelli, E., Marcantoni, A., Ferraguto, C., Tomagra, G., . . . Studer, M. (2020). COUP-TFI/Nr2f1 Orchestrates Intrinsic Neuronal Activity during Development of the Somatosensory Cortex. Cereb Cortex, 30(11), 5667-5685. doi:10.1093/cercor/bhaa137

      Feng, J., Hsu, W. H., Patterson, D., Tseng, C. S., Hsing, H. W., Zhuang, Z. H., . . . Chou, S. J. (2021). COUP-TFI specifies the medial entorhinal cortex identity and induces differential cell adhesion to determine the integrity of its boundary with neocortex. Sci Adv, 7(27). doi:10.1126/sciadv.abf6808

      5) I am confused why the authors analyzed 1-month-old mice in some instances but 3-month-old mice in others.

      The RXCre/+; COUP-TFIF/F; COUP-TFIIF/F double mutant mice barely survived beyond postnatal week 3. To make our findings consistent and comparable, we mainly prepared figures with observations on about 1-month-old mice in the RXCre related single and/or double gene mutant mouse models. In the study of the Emx1Cre related COUP-TFI mouse model, because behavioral tests such as the Morris water maze test were required, experiments were performed with adult animals at about 3 months of age. To be consistent with the stage of the mice used for the behavioral tests, we only displayed morphological data with observations on the control and Emx1Cre/+; COUP-TFIF/F mutant mice at about 3 months of age.

    2. eLife assessment

      This is an important study demonstrating distinct roles for the nuclear receptor genes COUP-TFI and COUP-TFII in hippocampal development. The strength of evidence is compelling, using rigorous state-of-the-art methods to demonstrate functional redundancy of these genes in regulating the Lhx2/Lhx5 axis. The major strengths of the study are the dramatic morphogenic phenotypes, and the resultant altered gene networks. These findings have theoretical or practical implications beyond a single field, and will be of interest to geneticists, developmental neurobiologists and chromatin biologists among others.

    3. Reviewer #1 (Public Review):

      The hippocampus is a structure in the cerebral cortex known to be compartmentalised into regions with different functions. Dorsal hippocampus is involved in cognitive functions such as declarative memory and spatial navigation and interconnects chiefly with the neocortex. Ventral hippocampus interconnects with limbic structures such as amygdala and hypothalamus and is involved in affective states and anxiety. What specifies this functional regionalisation during development is not well understood. The present study focuses on the role of transcription factors COUPTFI and COUPTFII, confirming a previously observed dorsal to ventral gradient of expression of COUPTFI in both embryonic and adult mouse hippocampus, and reporting that expression of COUPTFII is strongest in ventral hippocampus. The aim of the authors was then to probe the role of these transcription factors with the use of conditional knockout of one or both factors using RxCre+ mice (sometimes Emx1Cre+ for comparison). As predicted, COUPTFI insufficiency resulted in failure of the CA1 subregion of the dorsal hippocampus to develop properly (with concomitant loss of performance in a spatial memory task). COUPTFII knockdown had even more marked effects upon the ventral hippocampus with ectopic CA1/CA3 domains forming, while a double knockout led to a drastic reduction in size of the hippocampus with subsequent effects upon the appearance of hippocampal synaptic circuitry and the capacity for adult neurogenesis (a feature of rodent hippocampus). In order to help explain the role of COUPTFI/II, a role in regulating expression of two transcription factors, LHX2 and LHX5, known to be crucial to hippocampal development, was tested by examining gene and protein expression. Changes in LHX2 and LHX5 were observed and a role for COUPTFI/II in regulating expression of these genes was postulated.

      I believe the authors have largely achieved their aims and the results mostly support the conclusions, but, as discussed further below, there are some weaknesses in the data and some areas that could be expanded upon and improved. The methods are mostly appropriate. The use of the transgenic mice and the application of histological methods, especially tyramide amplified immunohistochemistry, is exemplary. However, I'm not sure a wide enough range of tests to explore the phenotype of the transgenic mice was employed to back the conclusions drawn by the authors. The introduction and discussion are nicely written and explain the general concepts and conclusions well. The work makes an important contribution to our understanding of brain development in general and hippocampal development in particular.

      Turning to more specific comments, I must first point out that specification of the ventral hippocampus by expression of COUPTFII is not an entirely original finding, as it was suggested for the developing human hippocampus following immunohistochemical experiments illustrating COUPTFII expression to be confined to the ventral hippocampal structures of the medial temporal cortex (doi: 10.1093/cercor/bhx185). Of course, this study, unlike the present study, was restricted to fetal cortex, not adult, and also reported expression of COUP-TFI throughout dorsal and ventral hippocampal structures but without observing any dorsal to ventral gradient, however I feel its contribution to the field has been overlooked by the present study, and should be incorporated into the introduction and/or discussion.

      More information about Rx-cre mice would be informative and could help explain the different phenotype observed when EMX1-cre mice were used to conditionally knock down COUPTFI/II expression.

      The demonstration of antagonistic gradients of COUP-TFI and -TFII across the hippocampus is more convincing in the immunohistochemical preparations than in the western blots. The qualitative data presented in Fig.1p does not convincingly represent the quantitative data presented in Fig.1q. There seem to be multiple bands for COUP-TFII and I wonder exactly how quantifying this was approached?

      Behavioural testing is limited to one test of dorsal hippocampus function. Other tests for non-spatial memory, e.g. novel object recognition, or ventral hippocampus function, e.g. step through passive avoidance, might have led to some interesting discriminations between the various knock down animals (see doi: 10.3389/fnagi.2018.00091).

      Abnormalities in the trisynaptic circuit. No studies of actual synapses, either physiological or morphological, were carried out. I wonder to what extent these immunohistochemical studies just further reflect the abnormalities in hippocampal morphology presented earlier in the manuscript without specifically telling us about synaptic circuits? Although the immunohistochemical preparations are beautiful, they are inadequate on their own in telling us much about what sort of synaptic circuitry exists in the transgenic animals.

      LHX2/LHX5 interaction. The immunohistochemical study, which shows clear differences in LHX5 and LHX2 protein expression at E14.5 in double knockdown mice, is more convincing than the qPCR study at E11.5, which shows surprisingly small differences in mRNA expression. Could the authors expand upon whether this is due to stage of development, or differences between mRNA and protein expression? Why haven't both mRNA and protein expression data at both time points been presented?

      Response to the re-submission

      I am happy that the western blot presentation has been improved, and my minor comments attended to. It is disappointing, although understandable given the timeframe, that the lack of qPCR data at 14.5 ED has not been rectified. The immunohistochemical data alone is qualitative and only indicative of LHX5 expression remaining depressed and LHX2 expression possibly increasing between E11.5 and E14.5. In the absence of qPCR data, a more quantitative immunohistochemical study, such as counting blind the number of LHX5+ Cajal-Retzius cells, or measuring optical density of LHX2 expression under rigorous experimental conditions regarding image collection and processing, would be required to support the hypothesis that COUPTFI/II expression modulates the LHX2/LHX5 axis.

    4. Reviewer #2 (Public Review):

      The authors chose to limit their response to re-doing the Lhx5 immuno using the correct antibody which now displays the expected staining: Lhx5 expression is limited to the hem. They have not however presented a characterization of where the RxCre acts, although this was pointed out by other reviewers as well. It would have been useful to demonstrate the expression domain in particular with respect to the time of its initiation, to explain how it causes a phenotype close to that described for the Lhx5 knockout (Zhao et al., 1999). From the decrease of Lhx5 expression and the CR cells which arise from the hem, it appears that the RxCre does indeed act in the hem. However, the timing and spatial pattern is important to establish, as I had pointed out in my first review, "If [the expression of RxCre] it has a dorso-ventral bias in the early embryo, it could explain the regional difference in the COUPTF phenotypes."

      The major interpretive criticisms I made have not been addressed even though these would have only required a re-writing and re-interpretation of the data. The revised manuscript continues to include major errors of interpretation such as the idea that Lhx2 and Lhx5 "inhibit each other", something that is unsupported since the expression domains of these two genes are mutually exclusive as is clear from the authors' own new data and the literature. Lines 355-360: "The expression of Lhx2 was comparable between the control and double-mutant mice at E11.5 (Figure 5Be-h, e'-h'). Interestingly, the expression of the Lhx2 protein was increased in the hippocampal primordium in the COUP-TF double-mutant mice at E13.5 and E14.5 (Figure 5Bm-p, m'-p', u-x, u'-x'). The upregulation of Lhx2 expression is most likely associated with the reduced expression of the Lhx5 gene" There's clearly no Lhx5 in the hippocampal primordium so how is this possible?

      The authors have missed the insights from key papers that they cite, e.g. (lines 352-354) "The expression of Lhx2 was expanded ventrally into the choroid plexus in the Lhx5 null mutant mice (Zhao et al., 1999)" - this paper in fact shows there is no choroid plexus. Lhx2 appears to extend to the midline likely because the hem isn't specified. The authors would benefit from reading https://doi.org/10.1101/2022.10.25.513532 in which Lmx1a is shown to be the master regulator of the hem.

      A sentence like (lines 77-81) further blurs the literature: "Intriguingly, deficiency of either Lhx5 or Lhx2 results in agenesis of the hippocampus, and more particularly, these genes inhibit each other (Hébert & Fishell, 2008; Mangale et al., 2008; Roy, Gonzalez-Gomez, Pierani, Meyer, & Tole, 2014; Zhao et al., 1999), indicating that the Lhx5 and Lhx2 genes may generate an essential regulatory axis to ensure the appropriate hippocampal development"

      First, none of the papers they cite shows that these two factors inhibit each other. Second, the "agenesis of the hippocampus" in the Lhx2 knockout mentioned in Porter et al. (1997) was later shown to be due to a transformation of the hippocampal primordium into an EXPANDED hem (Mangale et al.). In contrast, the "agenesis of the hippocampus" in the Lhx5 mutant appears to be due to the near-complete LOSS of the hem, as evidenced by the loss of its derivatives, the choroid plexus and the CR cells (Zhao et al., 1999). The fact that loss of these two factors has opposite effects on the hem (each resulting in loss of the hippocampus, one due to transformation of the hippocampal primordium into hem and the other because of a lack of hippocampal induction) does not mean that there is an Lhx5-Lhx2 "axis" regulating hippocampal development.

      I won't repeat my other comments here, but the majority of them were not addressed in any way.

      In conclusion, I find it unfortunate that the authors have chosen not to use the detailed input provided by the reviewers which would have greatly improved their manuscript.

    5. Reviewer #3 (Public Review):

      The authors have made significant improvements in addressing my major concerns raised during the previous review. However, I still have some lingering concerns regarding the quantification and statistical analysis presented in the manuscript. Specifically, there is a lack of robust quantification and statistical analysis to support the conclusions drawn, particularly in relation to the numbers of DG, CA1, and CA3 neurons.

      To strengthen the validity and reliability of the findings, I would strongly recommend the authors to incorporate a more rigorous quantitative approach in their research. This could involve implementing stereological methods or other appropriate techniques to accurately estimate the numbers of neurons in the DG, CA1, and CA3 regions. By doing so, the authors would enhance the credibility of their conclusions and provide more solid evidence for their claims.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their comments. We have now addressed all the comments in a revised version of the manuscript, which we believe has strengthened our paper.

      1) Introduction LINE 60: the authors cite Funato et al. 2016 as the paper first describing a role for SIK3 in sleep regulation. In fact, the role for this kinase was first identified nearly a decade earlier in C. elegans (Van der Linden et al., Genetics 2008, PMID 18832350).

      Thank you for pointing us to this reference. Van der Linden et al. demonstrated that the C. elegans homolog of Sik3 (KIN-29) regulates satiety quiescence, in which worms stop moving following feeding on high quality food. However, as pointed out in Trojanowski and Raizen “Call it Worm Sleep” (2016), not all of the behavioral criteria for sleep have been applied to C. elegans satiety quiescence, and we cannot find any references that unequivocally demonstrate satiety quiescence is a sleep state. As McClanahan et al. (2020) show, quiescent states following mild sensory arousal do not fulfill the sleep criteria of changes in arousal threshold and homeostatic regulation, so not all quiescent states in C. elegans are sleep. Then again, Grubbs et al. (2020) do demonstrate that KIN-29 regulates both developmentally timed and stress-induced sleep states in worms, suggesting that the observations in Van der Linden et al. were ahead of their time and these behavioral states are possibly inter-related. We believe, though, that our line “the roles of… SIK3 kinase in modulating sleep homeostasis in mice (Funato et al. 2016) were identified in genetic screens” remains accurate.

      2) Introduction LINE 71: remove the word "known" from "...while some known human sleep/wake regulators, such as the..."

      Good idea. Done.

      3) I was confused regarding Supplemental data 1 describing the genes they targeted with their forward genetic screen. Am I understanding correctly from the "Summary stats" tab that 702 fish lines with virus insertions were screened behaviorally? In Figure S1, it looks like about 60 are shown in the histograms but in the text (in the Discussion) they say 25 were screened. Were all the genes listed under the Excel tabs (GPCRs, channels, etc) tested? Or was just a subset tested? Where are the sleep data for these lines? Negative results may be relevant to their manuscript since they listed (tested??) a number of ion channel genes under tab "channels" which appear to NOT have a sleep phenotype.

      We apologize for the confusion on these points. As highlighted in the legend to Supplementary Figure S1, we had planned a screening strategy with the following pipeline: Candidate mammalian gene → Zebrafish ortholog → ID viral insertion from “Zenemark” library → grow viral insertion lines from frozen sperm → phenotype F3 heterozygous and homozygous mutant generation. Unfortunately, the company, Znomics, which held the Zenemark library, could not reliably reconstitute the correct live fish from the sperm library, and of the 702 lines we planned to screen, we could only screen 26 (25 was a typo) lines. We treated heterozygous and homozygous animals for each line independently, for a total of 52 screened lines in the histograms.

      To make this clearer, we have edited the main text as follows (lines 104-105): “For screening, we identified zebrafish sperm samples from the Zenemark collection (Varshney et al., 2013) that harboured viral insertions in genes of interest and used these samples for in vitro fertilization and the establishment of F2 families, which we were able to obtain for 26 lines.” And lines 111-112: “While most screened heterozygous and homozygous lines had minimal effects on sleep-wake behavioural parameters (Figure S1B-S1C),”

      We believe it is important to include the full set of Supplementary Data 1, even though the vast majority of these candidate lines were not tested.

      4) Results LINE 117: remove the word "prominent", which is subjective, from the sentence "...showed a prominent decrease in sleep during the..."

      Good point. Done.

      5) LINES 185-186: did you see any circadian variation in your dmist:GFP protein abundance or localization? Protein trafficking has been described as a mechanism of circadian regulation of excitability.

      For practical reasons, we imaged the membrane localization of Dmist:GFP in plasmid-injected embryos at 90% epiboly, which is about 9 hours after fertilization and when the cells remain large and in a relatively flat epithelium. Thus, we could not follow circadian fluctuations in abundance or localization. For circadian studies, we believe the best method will be to raise an antibody that recognizes Dmist.

      6) LINE 203: does the GFP-tagged Dmist rescue the loss-of-function phenotype? This is relevant to Figure 2E. It is also relevant to the issue of structure-function. If it rescues, then the C-terminus may not be essential to protein function.

      As noted, for practical reasons, we observed Dmist-GFP only transiently at early stages of development, expressed using a strong, ubiquitous promoter. A rescue experiment is a good idea for future experiments, where we carefully control the expression of Dmist in neurons.

      7) LINE 220: explain what you mean by "...consistent with nonsense-mediated decay." and/or give a reference.

      In zebrafish and other species including humans, mutant transcripts that have premature stop codons often undergo “nonsense mediated decay”, whereby the expression levels are largely reduced (Wittkopp et al., 2009). In the zebrafish community, this is often used as secondary evidence of a loss of function mutation, as relatively few antibodies are available to directly observe zebrafish proteins. We have added a reference that describes this phenomenon (Wittkopp et al., 2009).

      8) LINE 225: define "LME model"

      Now reads: “Linear mixed effects (LME).”

      9) LINES 227-229: could the vir/vir phenotype be explained by specific effects on protein structure? could vir/vir be a gain-of-function allele?

      We can’t rule this out formally, and vir/+ animals do show some sleep phenotypes, albeit weaker than those of vir/vir animals (Figure 1G). However, it is not uncommon for heterozygous mutants to show significant phenotypes that are weaker than those of their homozygous mutant siblings, and the strong suppression of dmist expression by the viral insertion (which is located in the dmist intron) is more consistent with a hypomorphic loss-of-function phenotype for the vir allele.

      10) LINES 229-230: I don't quite follow the argument for pursuing further studies only of i8/i8. i8/i8 seems to also be a hypomorphic allele based on your qPCR data.

      First, the dmist viral line was generated by an insertional mutagenesis method followed by sequencing, and each line has multiple other inserts in a background that does not match the background of the other animals reported in this paper. Second, the dmist vir allele is an insertion in the intron, leading to reduced, but not complete loss of expression. In contrast, the i8 allele was generated on the same background strain as our other existing and newly reported lines. Moreover, our i8 line is likely a loss-of-function allele and not a hypomorph. Yes, dmist expression is reduced in the i8 allele; however, this is likely due to nonsense mediated decay of dmist mRNA. The mutation introduces a frameshift in the dmist coding sequence, and as a result the amino acid sequence of the protein is altered after the N-terminal signal sequence.

      11) LINES 241-243: grammar.

      Fixed

      12) LINE 245: define "JackHMMR iterative search"

      We’ve added the phrase: “and seeding a hidden Markov model iterative search (JackHMMR)”

      13) LINE 246 is missing the word "we" prior to "...found distant homology between..."

      Added

      14) LINE 301: show data demonstrating deviation from Mendelian ratios. Also, comment on meaning of such data (embryonic lethality??).

      We have added this data in the line (301):

      “atp1a3b mutant larvae were not obtained at Mendelian ratios (55 wild type [52.5 expected], 142 [105] atp1a3b+/-, 13 [52.5] atp1a3b-/-; p<0.0001, Chi-squared) suggesting some impact on early stages of development leading to lethality.”
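
      The goodness-of-fit test quoted above can be reproduced with a short calculation. The sketch below is only illustrative: it assumes the expected counts come from a 1:2:1 Mendelian ratio (which matches the bracketed values of 52.5, 105 and 52.5 for 210 larvae), and it uses the closed-form tail probability of the chi-squared distribution with two degrees of freedom rather than whatever statistics package the authors actually used. The variable names are hypothetical.

```python
import math

# Genotype counts as reported in the authors' response.
observed = {"wild type": 55, "atp1a3b+/-": 142, "atp1a3b-/-": 13}

# Assumption: expected proportions follow a 1:2:1 Mendelian ratio.
ratios = {"wild type": 0.25, "atp1a3b+/-": 0.50, "atp1a3b-/-": 0.25}

total = sum(observed.values())
expected = {genotype: total * ratio for genotype, ratio in ratios.items()}

# Pearson chi-squared statistic: sum over genotypes of (O - E)^2 / E.
chi2 = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)

# Three genotype classes with fixed proportions give 2 degrees of freedom.
df = len(observed) - 1

# For exactly 2 degrees of freedom the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.2e}")
```

      Under these assumptions the statistic comes out near 42.9, dominated by the shortfall of homozygous mutants, and the p-value falls well below the reported threshold of 0.0001.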

      15) Discussion LINES 362-372: This paragraph seems to be of only tangential relevance to the paper. Consider removing.

      Our screening strategy was a large-scale reverse genetic screen, but the number of lines was limited by the technical issues described above. We think it is important to mention that the strategy, if employed today, could benefit from newer technologies.

      16) Discussion. Another model is that Dmist and the Na/K pump have a developmental effect. Arguing against this developmental model is the ouabain experiment.

      This is an important point. We’ve added the line (lines 454-457): “We also cannot exclude a role for Dmist and the Na+/K+ pump in developmental events that impact sleep, although our observation that ouabain treatment, which inhibits the pump acutely after early development is complete, also impacts sleep, argues against a developmental role.”

      17) FIGURE 1G: Are these significance cut offs corrected for multiple comparisons?

      Yes, all the data is corrected for multiple comparisons.

      18) performing neuronal activity measures, either via neural activity imaging or phospho-ERK labeling in different mutants at day or night conditions, to determine whether baseline neuronal activity brain-wide or in specific brain regions are altered.

      These are excellent experiments that we plan to perform in the future.

      19) Please check all Figure numbers for accuracy.

      We have double checked these.

      20) The authors emphasize the role of increased cellular sodium, but equally plausibly, the phenotypes could be due to decreased cellular potassium. The potassium channel shaker has been previously identified as a critical sleep regulator in Drosophila.

      We completely agree. We would like to highlight that we did devote an entire paragraph to the possibility of changes in extracellular potassium in the discussion: “A third possibility is that Dmist and the Na+,K+-ATPase regulate sleep not by modulation of neuronal activity per se but rather via modulation of extracellular ion concentrations. Recent work has demonstrated that interstitial ions fluctuate across the sleep/wake cycle in mice. For example, extracellular K+ is high during wakefulness, and cerebrospinal fluid containing the ion concentrations found during wakefulness directly applied to the brain can locally shift neuronal activity into wake-like states (Ding et al., 2016). Given that the Na+,K+-ATPase actively exchanges Na+ ions for K+ , the high intracellular Na+ levels we observe in atp1a3a and dmist mutants is likely accompanied by high extracellular K+. Although we can only speculate at this time, a model in which extracellular ions that accumulate during wakefulness and then directly signal onto sleep-regulatory neurons could provide a direct link between Na+,K+ ATPase activity, neuronal firing, and sleep homeostasis. Such a model could also explain why disruption of fxyd1 in non-neuronal cells also leads to a reduction in night-time sleep.”

      We also agree that Shaker may be an important component of this sleep regulatory mechanism. Indeed, we previously showed that another potassium channel in zebrafish regulates sleep (Rihel et al., 2010).

      We have emphasized sodium homeostasis in our title and paper only because we were able to directly observe intracellular sodium levels, so we are confident that these have been altered in our mutants. We can only presume that potassium levels have also been altered, but we could not directly observe this.

      21) The similar phenotype between dmist and Fxyd1 in sleep reduction yet very different expression patterns, with dmist being mostly neuronal while fxyd1 being mostly non-neuronal, raises many possible questions: 1) are the sleep phenotypes due to neuronal Na/K imbalance? Or 2) Are the sleep phenotypes due to extracellular Na/K imbalance? Or 3) both? Some feasible experiments may help achieve a better mechanistic understanding of the observed sleep defects.

      Yes, we think these are excellent studies for future work. As noted in the previous point (20), we did discuss the possibility that changes to extracellular potassium might be a parsimonious explanation for the similar phenotypes of fxyd1 and dmist mutants.

      Future experiment suggestions (not required)

      1) Perform a double mutant analysis of fxyd1 and atp1a3a, to determine whether an epistatic relationship similar to that of dmist and atp1a3a is observed in the case of fxyd1 and atp1a3a.

      This is a great experiment that we will do in the future. Unfortunately, the fxyd1 mutant had been sperm frozen during the COVID-19 pandemic, so we cannot do this experiment at this time.

      2) Given the differences in the sleep phenotypes between vir/vir and i8/i8 mutants, would be informative to see the phenotype of the vir/i8 trans-heterozygote.

      This is also a good experiment to perform in the future. Since obtaining the cleaner i8 allele, the dmistvir/vir lines were sperm frozen.

    2. eLife assessment:

      This study offers new fundamental information on a role for the sodium/potassium pump in sleep regulation. Elegant methods were used to provide compelling evidence supporting the claim. The work will be of interest to sleep researchers in zebrafish as well as in other species for future investigation.

    3. Reviewer #1 (Public Review):

      Barlow et al. performed a viral insertion screen in larval zebrafish for sleep mutants. They identify a mutant named dreammist (dmist) that displayed defects in sleep, namely, decreased sleep both day and night, accompanied by increased activity. They find that dmist encodes a previously uncharacterized single-pass transmembrane protein that shows structural similarity to Fxyd1, a Na+,K+-ATPase regulator. Disruption of fxyd1 or atp1a3a, a Na+,K+-ATPase alpha-3 subunit, decreased night-time sleep. By staining for sodium levels, the authors uncover a global increase of sodium in both dmist and atp1a3a mutants following PTZ treatment, consistent with defects in Na+,K+-ATPase function. These genetic data from multiple mutant lines help establish the importance of sodium and/or potassium homeostasis in sleep regulation.

      The conclusions of this paper are mostly well supported by data, with the following strengths and weaknesses as described below.

      Strengths:
      Elegant use of CRISPR knockout methods to disrupt multiple genes that help establish the importance of regulating Na+,K+-ATPase function in sleep.
      Data are mostly clearly presented.
      Double mutant analysis of dmist and atp1a3a helps establish an epistatic relationship between these proteins.

      Weaknesses:
      The authors emphasize the role of increased cellular sodium, but equally plausibly, the phenotypes could be due to decreased cellular potassium. The potassium channel shaker has been previously identified as a critical sleep regulator in Drosophila.
      Although the increased sleep rebound after PTZ treatment in the dmist mutant is interesting, I find it difficult to understand, especially in the context of the dmist mutant having decreased sleep.

      The similar phenotype between dmist and Fxyd1 in sleep reduction yet very different expression patterns, with dmist being mostly neuronal while fxyd1 being mostly non-neuronal, raises many possible questions: 1) are the sleep phenotypes due to neuronal Na/K imbalance? Or 2) Are the sleep phenotypes due to extracellular Na/K imbalance? Or 3) both? Some feasible experiments may help achieve a better mechanistic understanding of the observed sleep defects.

    4. Reviewer #2 (Public Review):

      Barlow and colleagues describe a role for the Na+/K+ pump in sleep/wake regulation. They discovered this role starting with a forward genetic screen in which they tested a biased sample of virus insertion fish lines for sleep phenotypes. They found an insertion in a gene they named dreammist, which is homologous to the gene FXYD1 encoding single membrane-pass modifiers of Na/K pumps. They go on to show that genetic manipulations of either FXYD1 or the Na/K pump also reduce sleep. They use pharmacology and sleep deprivation experiments to provide further evidence that the Na/K pump regulates intracellular sodium and rebound sleep. This study provides additional evidence for the important role of membrane excitability in sleep regulation (prior studies have implicated K+ channel subunits as well as a sodium leak ion channel).

      The study is well done and convincing with regard to its major conclusions. I had some minor comments/questions, which they properly addressed in their revision and rebuttal.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this manuscript, the authors investigated the role of Elg1 in the regulation of telomere length. The main role of the Elg1/RLC complex is to unload the processivity factor PCNA, mainly after completion of synthesis of the Okazaki fragment in the lagging strand. They found that Elg1 physically interacts with the CST (Cdc13-Stn1-Ten1) complex and propose that Elg1 negatively regulates telomere length by mediating the interaction between Cdc13 and Stn1 in a pathway involving SUMOylation of both PCNA and Cdc13. Accumulation of SUMOylated PCNA upon deletion of ELG1 or overexpression of RAD30 leads to elongated telomeres. On the other hand, the interaction of Elg1 with Stn1 is SIM-dependent and occurs concurrently with telomere replication in late S phase. In contrast, the Elg1-Cdc13 interaction is mediated by PCNA-SUMO, is independent of the SIM of Elg1 but still dependent on Cdc13 SUMOylation. The authors present a model containing two main messages: 1) PCNA-SUMO acts as a positive signal for telomerase activation; 2) Elg1 promotes Cdc13/Stn1 interaction at the expense of Cdc13/Est1 interaction, thus terminating telomerase action.

      The manuscript contains a large amount of data that make a major inroad on a new type of link between telomere replication and regulation of the telomerase. Nevertheless, the detailed choreography of the events as well as the role of PCNA-SUMO remain elusive and the data do not fully explain the role of the Stn1/Elg1 interaction. The data presented do not sufficiently support the claim that SUMO-PCNA is a positive signal for telomerase activation.

      We thank the reviewer for her/his review efforts and opinion. We have re-submitted a new version of the manuscript in which we clarify some of the criticisms presented. In a point-by-point letter we respond to all the specific queries.

      Reviewer #2 (Public Review):

      This paper purports to unveil a mechanism controlling telomere length through SUMO modifications controlling interactions between PCNA unloader Elg1 and the CST complex that functions at telomeres. This is an extremely interesting mechanism to understand, and this paper indeed reveals some interesting genetic results, leading to a compelling model, with potential impact on the field. The conclusions are largely supported by experiments examining protein-protein interactions at low resolution and ambiguous regarding directness of interactions like co-IP and yeast two-hybrid (Y2H) combined with genetics. However, some results appear contradictory and there's a lack of rigor in the experimental data needed to support claims. There is significant room for improvement and this work could certainly attain the quality needed to support the claims. The current version needs substantial revision and lacks the necessary experimental detail. Stronger support for the claims would add detail to help distinguish competing models.

      We thank the reviewer for her/his positive opinion. We have re-submitted a new version of the manuscript in which we clarify some of the criticisms presented by the referees, and added all the missing experimental details. In a point-by-point letter we respond to all the specific queries.

      Reviewer #3 (Public Review):

      This paper reveals interesting physical connections between Elg1 and CST proteins that suggest a model where Elg1-mediated PCNA unloading is linked to regulation of telomere length extension via Stn1, Cdc13, and presumably Ten1 proteins. Some of these interactions appear to be modulated by sumoylation and connected with Elg1's PCNA unloading activity. The strength of the paper is in the observations of new interactions between CST, Elg1, and PCNA. These interactions should be of interest to a broad audience interested in telomeres and DNA replication.

      We thank the reviewer for her/his positive opinion. We have re-submitted a new version of the manuscript in which we clarify some of the criticisms presented. In a point-by-point letter we respond to all the specific queries.

      What is not well demonstrated from the paper is the functional significance of the interactions described. The model presented by the authors is one interpretation of the data shown, and proposes that the role of sumoylation is to temporally regulate the Elg1, PCNA and CST interactions at telomeres. This model makes some assumptions that are not demonstrated by this work (such as Stn1 sumoylation, as noted) and are left for future testing. Alternative models that envision sumoylation as a key in promoting spatial localization could also be proposed based on the data here (as mentioned in the discussion), in addition to or instead of a role for sumoylation in enforcing a series of switches governing a tightly sequenced series of interactions and events at telomeres. Critically, the telomere length data from the paper indicate that the proposed model depicts interactions that are not necessary for telomerase activation or inhibition, as telomeres in pol30-RR strains are normal length and telomeres in elg1∆ strains are not nearly as elongated as in stn1 strains. One possibility mentioned in the paper is that the PCNAS and Elg1 interactions are contributing to the negative regulation of telomerase under certain conditions that are not defined in this work. Could it also be possible that the role of these interactions is not primarily directed toward modulating telomerase activity? It will be of interest to learn more about how these interactions and regulation by Sumo function intersect with regulation of telomere extension.

      We present compelling evidence for a role of SUMOylated PCNA in telomere length regulation. Figure 1 shows that this modification is both necessary and sufficient to elongate the telomeres, indicating that PCNA SUMOylation plays a positive role in telomere elongation. The model we present is consistent with all our results. There are, of course, possible alternative models, but they usually fail to explain some of the results. We agree that the fact that pol30-RR presents normal-sized telomeres implies that SUMO-PCNA is not required for telomerase to solve the "end replication problem", but rather is needed for "sustained" activity of telomerase. Since elongated telomeres (by absence of Elg1 or by over-expression of SUMO-PCNA) was the phenotype monitored, this may require sustained telomerase activity. Similar results were seen in the past for Rnr1 (Maicher et al., 2017), and this mode depends on Mec1, rather than Tel1 (Harari and Kupiec, 2018). Telomere length regulation is complex, and we may not yet understand the whole picture. It appears that for normal “end replication problem” solution, very little telomerase activity may be needed, and spontaneous interactions at a low level may suffice. Future work may find the conditions at which telomerase switches from "end replication problem" to "sustained" activity. We have added further explanations on this subject to the Discussion section.

      We suspect, but could not prove, a role for Stn1 SUMOylation in the interactions. SUMOylation is usually transient, and notoriously hard to detect, and despite the fact that many telomeric proteins are SUMOylated, Stn1 SUMOylation could not be shown directly by us and others (Hang et al, 2011).

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data or analyses.

      • My main concern is the claim that SUMOylated PCNA acts as a positive signal for telomerase activation. Yet the pol30-RR mutant has no impact on telomere length. The explanation of the authors is not entirely convincing.

      We are aware that the regulation of telomere length is complex, and we may not fully understand it yet. Just consider the fact that ~500 genes participate in determining the final telomere length of a yeast (Askree et al., 2004). Since mutation in EACH of these genes has a phenotype, the implication is that the joint action of 500 players determines the outcome (a dialogue of 500 participants). Having said this, we clearly show in figure 1 that mutations that prevent PCNA SUMOylation prevent telomere length elongation in cells lacking Elg1, and overexpressing SUMOylated PCNA is enough to elongate the telomeres. Thus, SUMOylation of PCNA does act as a positive signal for elongation.

      However, it appears that to fulfill the minimal requirement of dealing with the "end-replication problem", PCNA SUMOylation is not required, and only a "sustained activity" mode requires the S-PCNA signal (as we have also shown, surprisingly, for RNR1, Maicher et al. 2017). This sustained activity mode depends on Mec1, rather than Tel1 (Harari and Kupiec, 2018). Since elongated telomeres (by absence of Elg1 or by over-expression of SUMO-PCNA) was the phenotype monitored, this may require sustained telomerase activity. Telomere length regulation is complex, and we may not yet understand the whole picture. It appears that for normal “end replication problem” solution, very little telomerase activity may be needed, and spontaneous interactions at a low level may suffice (for example, unmodified PCNA may promote telomerase activity at a lower level than that of SUMO-PCNA). Future work may find the conditions at which telomerase switches from "end replication problem" to "sustained" activity.

      We have added further explanations on this subject to the Discussion section.

      • The model is entitled « Elg1 negatively regulates the telomere length by forming an interaction with the CST complex ». Nevertheless, expression of PCNA-RR completely reversed the long telomere phenotype of elg1∆ cells. Thus it appears that although the interaction between Stn1 and Cdc13 is reduced in the absence of Elg1, Elg1/Stn1 interaction is not instrumental in the formation of the CST complex and thus in the termination of telomerase activity. Does the elg1∆SIM mutant that does not interact with Stn1 impact telomere length?

      • In the model part (line 318), it is argued that the complex Elg1-Stn1 unloads SUMOylated PCNA. Elg1-Stn1 interaction depends on the SIM of Elg1. This SIM is however not required for Elg1's function in genome-wide SUMO-PCNA unloading; is it required specifically at telomeres?

      The interactions between Elg1 and SUMOylated PCNA are carried out through both the SIM and the Threonines 386 and 387 (Shemesh et al, 2017). Consistently, the single elg1-SIM mutant has telomeres of normal length, and its effects on telomere length can only be seen when combined with mutations in the Threonines (elg1-TT386/7AA or elg1-TT386/7DD). Although the unloading of SUMOylated PCNA by Elg1 is important, the gene is not essential, and PCNA is either eventually unloaded by RFC, or spontaneously disassembles. This explains why the telomere length does not reach the same length in the absence of Elg1 as in the absence of, say, Stn1.

      • The model suggests that Elg1 promotes the interaction between Cdc13 and Stn1. This is based on the data presented in Figure 5 E and F. This is an important result. Because the experiment has been done on cells synchronized in S phase and the Elg1/Stn1 interaction occurs specifically at the end of S-phase, the FACS profile should be shown or a control provided to show that the two conditions are comparable.

      The FACS profile for this experiment is shown in Figure 5C.

      • Does the interaction between Cdc13 and Pol30 depend on the SUMOylation of Pol30?

      Yes. We have added this as new Figure S2, and presented the results together with Figure 3 (Figure 3 is already too crowded).

      Other points:

      • Fig 1: it should be mentioned in the Materials and Methods or in the figure legend how the average telomere lengths (horizontal bar) were calculated from the teloblot, as the position of the bar is not always intuitive.

      We estimate telomere length by using TelQuant (Rubinstein et al., 2014). We have added this to the Methods section.

      • Fig 2: Owing to the large span of telomere length in the stn1 mutants, the epistatic relationship between elg1∆ and stn1 mutants is poorly illustrated by the teloblot.

      We repeated this experiment several times, and stn1 mutants consistently gave a widely spread telomere length distribution. In ALL the blots, however, the double mutants elg1 stn1 showed a telomere length similar to that of the single stn1 mutant, and never longer.

      • It is mentioned that other mutants in the collection showed epistasis. Are any of these mutants related to telomere replication or the proposed model?

      Since we used the collection of non-essential mutants (so far), it was quite devoid of genes involved in DNA replication, which are mostly essential. An exception was siz1, which showed epistasis with elg1Δ.

      • The section entitled « Elg1's functional activity is essential for its interaction with Cdc13 » (line 205) is difficult to follow. The hierarchy between the different mutants of Elg1 on their capacity to unload PCNA is not totally in agreement with the data published in Itzkovich et al. 2023 and Shemesh et al. 2017. In particular, it appears to me from these papers that the elg1-WalkerA 238 (KK343/4AA) mutant did not show a defect, in contrast to elg1-WalkerA 238 (KK343/4DD).

      We are sorry for the typo in the results. We used the elg1-WalkerA (KK343/4DD) allele, which has a normal SIM but no activity. In a nutshell, we used mutants that either did or did not show unloading activity and/or SIM. The results clearly show that PCNA must be unloaded in order for the N-terminus of Elg1 to interact with Cdc13.

      • Are the synchronizations done at 30°C?

      Yes. We have added the information to the Methods section.

      • ChIP experiments are not described in the Materials and Methods.

      We apologize for this. They are now described.

      • In Figure 6, the PCNA rings are curiously placed at the beginning of the Okazaki fragments.

      We thank the referee for noticing; we have corrected the figure.

      Reviewer #2 (Recommendations For The Authors):

      This paper purports to unveil a mechanism controlling telomere length through SUMO modifications controlling interactions between PCNA unloader Elg1 and the CST complex that functions at telomeres. This is an extremely interesting mechanism to understand, and this paper indeed reveals some interesting genetic results, leading to a compelling model, with potential impact on the field. The conclusions are largely supported by experiments examining protein-protein interactions at low resolution and ambiguous regarding directness of interactions like co-IP and yeast two-hybrid (Y2H) combined with genetics. However, some results appear contradictory and there's a lack of rigor in the experimental data needed to support claims. There is significant room for improvement and this work could certainly attain the quality needed to support the claims. The current version needs substantial revision and lacks necessary experimental detail. Stronger support for the claims would add detail to help distinguish competing models.

      Specific comments:

      Insufficient technical detail: I could find no explanation of how overexpression was achieved. No description of how teloChIP is performed, either for the PCNA IP or how the sequence analysis is performed. Too limited details on growth like exact temperatures for the cell cycle time course.

      We have significantly expanded the Methods section to include all the technical information.

      Please do not bold and underline text for emphasis-EVER

      We have removed those from the text.

      Lines 130-132: they have not shown "accumulation of SUMOylated PCNA" anywhere; this is an inference.

      We have modified the text. It now says: "show that SUMOylated PCNA, and not unmodified or ubiquitinated PCNA, is both necessary and sufficient for telomere elongation in the presence or in the absence of Elg1."

      Fig 2A Can authors show any other very long-telomere mutant like stn1 that does show enhancement in combination with elg1∆ to show feasibility of such phenotype?

      We don't think it is appropriate for the paper, but we have systematically created double mutants with elg1Δ and found many additive and even synergistic interactions. An example is shown in Author response image 1, taken from the PhD thesis of Taly Ben-Shitrit, a PhD student in the lab.

      Author response image 1.

      What about cdc13 or ten1? Epistatic?

      We did not test telomere length in combination with Ten1. Combining elg1 with cdc13-50 resulted in synergistic elongation. Given the complex genetic relationship between Stn1/Ten1 and Cdc13, it is hard to interpret this result.

      Seems tenuous to use Y2H to decipher protein-protein interactions occurring out of context (i.e., not at telomere but at reporter gene promoter)

      Y2H is a great method to detect interactions, even if they are transient. Whenever possible, we confirm our findings using co-IP or telo-ChIP.

      Lines 268-270: It would be more accurate to state "can be" instead of "becomes" or "is" as they have not shown that SUMOylation or PCNA unloading have occurred.

      We agree, and have changed the text.

      Cdc13snm protein level?

      Unfortunately our Western blot is not presentable, but the level of Cdc13snm was similar to that of the wt Cdc13, and this result has been already published by Hang et al., 2011.

      Fig S3A: If SUMOylated Cdc13 mediates the Stn1-Elg1 interaction, why is Stn1-Elg1 interaction maintained in cdc13snm strain? This result seems to directly contradict the premise and overall conclusion of this section that Cdc13-SUMO mediates the (Y2H) interaction of Elg1 and Stn1.

      According to our model, the interaction between Stn1 and Elg1 takes place upstream, and only then this complex interacts with SUMOylated Cdc13. Hence, if Cdc13 cannot be SUMOylated, the interaction Elg1-Stn1 is not lost, although Stn1 fails to interact with Cdc13, leading to a telomeric phenotype.

      Line 279: which data establish the Stn1-Elg1 interaction as direct? The Fig 2B co-IP indicates a physical but not necessarily direct interaction, but later the authors suggest that the interaction requires a SUMOylated intermediary, and the Y2H in Fig. S3B doesn't demonstrate direct interaction.

      We have changed the text, taking out the word "direct".

      Co-IP shows that interaction of Elg1 with Stn1 occurs mainly during later S-phase and with an overall delay compared to initial Elg1-Pol3 interaction. Co-IP interaction between Cdc13 and Stn1 is reduced in the absence of Elg1.

      The subsection title: "The interaction of Elg1 with Stn1 takes place at telomeres only at late S-phase" is not well supported by the data. I agree the data are consistent with the idea of the interactions occurring at telomeres but there's no direct evidence of this.

      We have changed the subsection title. It now reads: "The interaction of Elg1 with Stn1 takes place only at late S-phase".

      Model: Is unloading happening at the fork? Doesn't PCNA unloading have to follow its loading, which occurred behind the fork, particularly on the lagging strand? The model now suggests that Stn1 itself is SUMOylated.

      Yes, according to the model Elg1 moves with the fork, unloading PCNA from the lagging strand. Once Elg1 reaches the telomeres, it interacts with Stn1 (Figure 5). This interaction requires SUMOylation of Stn1 or of some other protein, which is neither PCNA (Figure 3D) nor Cdc13 (Figure S3A), and could be Stn1 itself or another telomeric protein (Hang et al., 2011).

      Title is rather vague.

      We think it summarizes what we present in the paper.

      Abstract:

      "We report that SUMOylated PCNA acts as a signal that positively regulates telomerase activity."

      I don't think this is supported, or that it is a good description of what they find.

      Figure 1B clearly shows that SUMO-PCNA is both necessary and sufficient for telomere elongation.

      "and dissected the mechanism by which Elg1 and Stn1 negatively regulates telomere elongation, coordinated by SUMO."

      Again, I don't think this is sufficiently supported, and the model invokes SUMOylation events that are not demonstrated, such as that of Stn1, which might be a significant step forward.

      On the positive side, their model makes several predictions that they could test much more directly and rigorously: for example, examining the impact of the relevant mutations in the recruitment of proteins to the telomere.

      We have dissected the mechanism, and future work will be devoted to examining the impact of the relevant mutations in the recruitment of proteins to the telomere.

      Reviewer #3 (Recommendations For The Authors):

      Comments:

      1) The telomere length analysis data presented here are consistent with an interpretation that Stn1 and Elg1 play roles in a similar telomere maintenance pathway, because the telomere restriction fragment patterns in the double mutants are not longer than in the stn1 single mutants. No comment is made with respect to the yellow bars in Figure 2, which presumably measure telomere length, appearing to be slightly shorter than in the stn1 single mutants. It may be interesting and informative if the double mutants do in fact have some phenotype distinct from the single stn1 mutants. Is there an impact on viability in the double mutant?

      Given the variable telomeric phenotype of the single stn1 mutants, slight variations in the measurement of the median telomere size are expected. The difference observed is not likely to be significant. What is important is that the double mutants with elg1 do not show longer telomeres. In terms of fitness, the stn1 mutants grow slightly slowly, but the elg1 mutation does not slow them down further.

      2) It is somewhat surprising that no additional telomere length analysis is included that actually tests the proposed model, including whether this path could be operational only under certain conditions. Maybe this is a topic of the next paper?

      Indeed, future work will explore the conditions under which PCNA SUMOylation is essential, and those under which it is only needed.

      3) Were the error bars in Figure 5F determined only from the experiment in E? Does this represent error in measuring the data from one biological replicate? The type of error should be made clear to avoid readers assuming the data represents measurements from more than one sample in more than one experiment. The data would be stronger if it represented measurements from multiple experiments.

      The graph was made with data from three biological replicates. We show the best blot in Figure 5E. We have now stressed this in the Figure Legend.

      4) Why was only one two hybrid reporter shown? Having the multiple reporters can give confidence in interactions. (Not a big deal here given the nice co-IP data.)

      We thought that it was enough to show one reporter, as the results with a different reporter (β-gal assay) led to the same conclusions. Since these did not add information and made the paper too lengthy (and boring), we took them out. In any case, all data were verified by co-IP.

      5) Line 414 - what are the 32P-radio labeled PCR fragments? Are these solely comprised of TG1-3 repeats of some length? A bit more detail in this aspect of the method could be helpful.

      We have added an explanation on the probe in the Methods section.

      6) Lines 432-433 - which anti-HA or anti-Myc antibodies are these? (very minor detail)

      We have added the details.

    2. eLife assessment

      This important study aims to discover the mechanisms governing the switch between conventional DNA replication and the specialized mechanism of telomere end replication. Solid genetic and biochemical assays suggest an interplay between sumoylated PCNA and chromosome terminal capping proteins. The questions addressed have implications for several fields, such as genome stability.

    3. Reviewer #1 (Public Review):

      In this manuscript, the authors investigated the role of Elg1 in the regulation of telomere length. The main role of the Elg1/RLC complex is to unload the processivity factor PCNA, mainly after completion of synthesis of the Okazaki fragment in the lagging strand. They found that Elg1 physically interacts with the CST (Cdc13-Stn1-Ten1) complex and propose that Elg1 negatively regulates telomere length by mediating the interaction between Cdc13 and Stn1 in a pathway involving SUMOylation of both PCNA and Cdc13. Accumulation of SUMOylated PCNA upon deletion of ELG1 or overexpression of RAD30 leads to elongated telomeres. On the other hand, the interaction of Elg1 with Stn1 is SIM-dependent and occurs concurrently with telomere replication in late S phase. In contrast, the Elg1-Cdc13 interaction is mediated by PCNA-SUMO and is independent of the SIM of Elg1, but still dependent on Cdc13 SUMOylation. The authors present a model containing two main messages: 1) PCNA-SUMO acts as a positive signal for telomerase activation; 2) Elg1 promotes Cdc13/Stn1 interaction at the expense of Cdc13/Est1 interaction, thus terminating telomerase action.

      The manuscript contains a large amount of data that make major inroads on a new type of link between telomere replication and regulation of the telomerase. Nevertheless, the detailed choreography of the events as well as the role of PCNA-SUMO remain elusive, and the data do not fully explain the role of the Stn1/Elg1 interaction. The data presented do not convincingly support the claim that SUMO-PCNA is a positive signal for telomerase activation. This was partially addressed in the current version.

    4. Reviewer #2 (Public Review):

      This paper purports to unveil a mechanism controlling telomere length through SUMO modifications controlling interactions between PCNA unloader Elg1 and the CST complex that functions at telomeres. This is an extremely interesting mechanism to understand, and this paper indeed reveals some interesting genetic results, leading to a compelling model, with potential impact on the field. Overall, however, the data do not provide sufficient support for the claims. The model may be correct but it is not yet convincingly demonstrated.

      The current version addressed some of the issues regarding language describing conclusions and more experimental detail has been provided. However, the authors have not provided new data supporting the model, so the overall evaluation is that the work remains inconclusive.

    5. Reviewer #3 (Public Review):

      This paper reveals interesting physical connections between Elg1 and CST proteins that suggest a model where Elg1-mediated PCNA unloading is linked to regulation of telomere length extension via Stn1, Cdc13, and presumably Ten1 proteins. Some of these interactions appear to be modulated by sumoylation and connected with Elg1's PCNA unloading activity. The strength of the paper is in the observations of new interactions between CST, Elg1, and PCNA. These interactions should be of interest to a broad audience interested in telomeres and DNA replication.

      What is not well demonstrated from the paper is the functional significance of the interactions described. The model presented by the authors is one interpretation of the data shown, and proposes that the role of sumoylation is to temporally regulate the Elg1, PCNA and CST interactions at telomeres. This model makes some assumptions that are not demonstrated by this work (such as Stn1 sumoylation, as noted) and are left for future testing. Alternative models that envision sumoylation as a key in promoting spatial localization could also be proposed based on the data here (as mentioned in the discussion), in addition to or instead of a role for sumoylation in enforcing a series of switches governing a tightly sequenced series of interactions and events at telomeres. Critically, the telomere length data from the paper indicate that the proposed model depicts interactions that are not necessary for telomerase activation or inhibition, as telomeres in pol30-RR strains are normal length and telomeres in elg1∆ strains are not nearly as elongated as in stn1 strains. One possibility mentioned in the paper is that the PCNAS and Elg1 interactions are contributing to the negative regulation of telomerase under certain conditions that are not defined in this work. Could it also be possible that the role of these interactions is not primarily directed toward modulating telomerase activity? It will be of interest to learn more about how these interactions and regulation by Sumo function intersect with regulation of telomere extension.

    1. eLife assessment

      This paper is a valuable step in multi-subject behavioral modeling using an extension of the Variational Autoencoder (VAE) framework. Using a novel partition of the latent space and in tandem with a recently proposed regularization scheme, the paper provides a rich set of computational analyses analyzing social behavior data of mice with results that represent the state-of-the-art in this subfield. The strength of evidence is convincing, with the methodology being well documented and the results being reproducible, although some additional quantifications would have been useful to fully gauge the circumstances where the approach would be most effectively applied.

    2. Reviewer #1 (Public Review):

      In this manuscript, the authors present a valuable new method to represent animal behavior from video data using a variational autoencoder framework that disentangles individual-specific and background variance from variables that can be more reliably compared across individuals. They achieve this aim through the use of a novel Cauchy-Schwarz (C-S) regularization term in their loss function that leads to latents that model continuously varying features in the images. The authors present a variety of validations for the method, including testing across sessions and individuals for a head-fixed task. They also show how the methods could be used for behavioral decoding from neural data and for quantifying social behavior in mice, demonstrating the applicability of the method outside of head-fixed environments and for different measurement modalities. While some areas of confusion and questions about the validation exist, this is an overall strong paper and an important contribution to this field.

      Strengths:

      - The use of the C-S regularizer is a novel approach that has potential for wide use across experimental paradigms and model organisms
      - The extent of the validations performed was solid, although perhaps not as convincing in a couple of cases as might be ideal
      - The GitHub code demo worked well, and the code appears to be accessible and well-written

      Weaknesses:

      - Some of the validation figures were a bit unclear in their presentation, making it difficult to assess exactly what had been tested
      - It is possible that I missed this, but the authors didn't really provide a sense of how to pick a particular distribution to match using the CS term for a specific paradigm/modality and how the choice affects the results
      - While the authors' statements about individual training vs. transfer learning accuracy and efficiency in Figure 6 are technically true, the effect size is rather small (a few percent at most in each case), thus I don't know how much of a big deal I would want to make out of these results
      - In general, I would have liked to have seen the Discussion section speak more to the choices and limitations inherent in applying the method. How does the choice of prior/metaparameters/architecture/etc. affect the results? In what situations would this method fail? What are the next advances that are necessary for the field to progress?

    3. Reviewer #2 (Public Review):

      This paper presents a valuable contribution to ongoing methods for understanding and modeling structure via latent variable models for neural and behavioral data. Building on the PS-VAE model of Whiteway et al. (2021), which posited a division of latent variables into unsupervised (i.e., useful for reconstruction) and supervised (useful for predicting selected labeled features) variables, the authors propose an additional set of "constrained subspace" latent variables that are regularized toward a prespecified prior via a Cauchy-Schwarz divergence previously proposed.

      The authors contend that the added CS latents aid in capturing both patterns of covariance across the data and individual-specific features that are of particular benefit in multi-animal experiments, all without requiring additional labels. They substantiate these claims with a series of computational experiments demonstrating that their CS-VAE outperforms the PS-VAE in several tasks, particularly that of capturing differences between individuals, consistency in behavioral phenotyping, and predicting correlations with neural data.

      Strengths of the present work include an extensive and rigorous set of validation experiments that will be of interest to those analyzing behavioral video. Weaknesses include a lack of discussion of key theoretical ideas motivating the design of the model, including the choice of a Cauchy-Schwarz divergence, the specific form of the prior, and arguments for the sorts of information the CS latents might capture and why. In addition, the model makes use of a moderate number of key hyperparameters whose effects on training outcomes are not extensively analyzed. As a result, the model may be difficult for less experienced users to apply to their own data. Finally, as with many similar VAE approaches, the lack of a ground truth against which to validate means that much of the evidence provided for the model is necessarily subjective, and its appeal lies in the degree to which the discovered latent spaces appear interpretable in particular applications.

      In all, this work is a valuable contribution that is likely to have appeal to those interested in applying latent space methods, particularly to multi-animal video data.

    4. Reviewer #3 (Public Review):

      As naturalistic neuroscience becomes increasingly popular, the importance of new computational tools that facilitate the study of animals behaving in minimally constrained environments grows. Yi et al convincingly demonstrate the usefulness of their new method on data from neuroethological studies involving multiple animals, including those with social interactions. Briefly, their method improves upon prior semi-supervised machine learning methods in that extracted latent variables can be more cleanly separated into those representing the behavior of individual subjects and those representing social interactions between subjects. Such an improvement is broadly useful for downstream analysis tasks in multi-subject or social neuroethological studies.

      Strengths:

      The authors tackle an important problem encountered in behavior analyses in an emerging subfield of neuroscience, naturalistic social neuroscience. They make a case for doing so using semi-unsupervised methods, a toolbox which balances competing scientific needs for building models using large neural-behavioral datasets and for model explainability. The paper is well written, with well-designed figures and relevant analyses that make for an enjoyable reading experience.

      The authors provide a remarkable variety of examples that make a convincing case for the utility of their method when used by itself or in conjunction with other data analysis techniques commonly used in modern neuroscience (behavioral motif extraction, neural decoding, etc.). The examples show not just that the extracted latents are more disentangled, but also that the improvement in disentangling has positive effects in downstream analysis tasks.

      Weaknesses:

      While the paper does a great job of applying the method to real world data, the components of the method itself are not as thoroughly investigated. For example, the contribution of the novel Cauchy-Schwarz regularization technique has not been systematically investigated. This could be done either by sharing additional data where hyperparameters control the contribution of the regularizer, or by citing relevant papers where such an analysis has been carried out. It would also be valuable to understand what other regularization techniques might potentially have been applicable here.

      The authors conclude from their empirical investigations that the specific prior distribution does not matter to the regularization process. This seems reasonable given that the neural network can learn a complex and arbitrary transformation of the data during training. It would be helpful if the authors could cite prior work where this type of prior distribution does matter and how their approach is different from such prior work. If there is a visualization/explainability related motivation for choosing one prior distribution over another, this could be clarified.

    1. eLife assessment

      This important study provides a framework bearing on the role of Eph-Ephrin signaling mechanisms in the clinical condition of amyotrophic lateral sclerosis. It provides compelling evidence for the roles of glial cells in this condition. This novel astrocyte-mediated mechanism may help identify future therapeutic targets.

    2. Reviewer #1 (Public Review):

      In the manuscript by Urban et al., the authors attempt to further delineate the role which non-neuronal CNS cells play in the development of ALS. Toward this goal, the transmembrane signaling molecule ephrinB2 was studied. It was found that there is an increased expression of ephrinB2 in astrocytes within the cervical ventral horn of the spinal cord in a rodent model of ALS. Moreover, the reduction of ephrinB2 reduced motoneuron loss and prevented respiratory dysfunction at the NMJ. Further driving the importance of ephrinB2 is an increased expression in the spinal cords of human ALS individuals. Collectively, these findings present compelling evidence implicating ephrinB2 as a contributing factor towards the development of ALS.

    3. Reviewer #2 (Public Review):

      The contribution of glial cells to the pathogenesis of amyotrophic lateral sclerosis (ALS) is of substantial interest and the investigators have contributed significantly to this emerging field via prior publications. In the present study, the authors use a SOD1G93A mouse model to elucidate the role of astrocyte ephrinB2 signaling in ALS disease progression. Signaling between erythropoietin-producing human hepatocellular receptors (Ephs) and the Eph receptor-interacting proteins (ephrins) is an important mediator of communication between neurons and non-neuronal cells in the nervous system. Recent evidence suggests that dysregulated Eph-ephrin signaling in the mature CNS is a feature of neurodegenerative diseases. In the ALS model, upregulated EphA4 expression in motor neurons has been linked to disease pathogenesis. In the present study, the authors extend previous findings to a new class of ephrinB2 ligands. Urban et al. hypothesize that upregulated ephrinB2 signaling contributes to disease pathogenesis in ALS mice. The authors successfully test this hypothesis and their results generally support their conclusion.

      Major strengths of this work include a robust study design, a well-defined translational model, and complementary biochemical and experimental methods such that correlated findings are followed up by interventional studies. The authors show that ephrinB2 ligand expression is progressively upregulated in the ventral horn of the cervical and lumbar spinal cord from pre-symptomatic through end stages of the disease. This novel association was also observed in post-mortem lumbar spinal cord samples from human ALS donors with a SOD1 mutation. Further, they use a lentiviral approach to drive knock-down of ephrinB2 in the central cervical region of SOD1G93A mice at the pre-symptomatic stage. Interestingly, in spite of using a non-specific promoter, the authors note that the lentiviral expression was preferentially driven in astrocytes.

      Since respiratory compromise is a leading cause of morbidity in the ALS population, the authors proceed to characterize the impact of ephrinB2 knockdown on diaphragm muscle output. In mice approaching the end stage of the disease, electrophysiological recordings from the diaphragm muscle show that animals in the knock-down group exhibited a ~60% larger amplitude. This functional preservation of diaphragm function was also accompanied by the preservation of diaphragm neuromuscular innervation. However, it must be noted that this cervical ephrinB2 knockdown approach had no impact on disease onset, disease duration, or animal survival. Furthermore, there was no impact of ephrinB2 knockdown on forelimb or hindlimb function.

      The major limitation of the manuscript as currently written is the conclusion that the preservation of diaphragm output following ephrinB2 knockdown in SOD1 mice is mediated primarily (if not entirely) by astrocytes. The authors present convincing evidence that a reduction in ephrinB2 is observed in local astrocytes (~56% transduction) following the intraspinal injection of the lentivirus. However, the proportion of cell types assessed for transduction with the lentivirus in the spinal cord was limited to neurons, astrocytes, and oligodendrocyte lineage cells. Microglia comprise a large proportion of the glial population in the spinal grey matter and have been shown to associate closely with respiratory motor pools. This cell type, amongst the many others that comprise the ventral gray matter, has not been investigated in this study. Thus, the primary conclusion that astrocytes drive ephrinB2-mediated pathogenesis in ALS mice is largely correlative. Further, it is interesting to note that no other functional outcomes were improved in this study. The C3-C5 region of the spinal cord consists of many motor pools that innervate forelimb muscles. CMAP recordings conducted at the diaphragm are a reflection of intact motor pools. This type of assessment of neuromuscular health is hard to recapitulate in the kind of forelimb task that is being employed to test motor function (grip strength). Thus, it would be interesting to see if CMAP recordings of forelimb muscles would capture the kind of motor function preservation observed in the diaphragm muscle.

      On a similar note, the functional impact of increased CMAP amplitude has not been presented. An increase in CMAP amplitude does not necessarily translate to improved breathing function or overall ventilation. Thus, the impact of this improvement in motor output should be clearly presented to the reader. Further, to the best of my knowledge, expression of Eph (or EphB) receptors has not been explicitly shown at the phrenic motor pool. It is thus speculative at best that the mechanism that the authors suggest in preserving diaphragm function is in fact mediated through Eph-EphrinB2 signaling at the phrenic motor pool. This aspect of the study would warrant a deeper discussion. Lastly, although the authors include both male and female animals in this investigation, they do not have sufficient power to evaluate sex differences. Thus, this presents another exciting avenue for future investigation, given that ALS has a slightly higher preponderance in males as compared to females.

      In summary, this study by Urban et al. provides a valuable framework for Eph-Ephrin signaling mechanisms imposing pathological changes in an ALS mouse model. The role of glial cells in ALS pathology is a very exciting and emerging field of investigation. The current study proposes a novel astrocyte-mediated mechanism for the propagation of disease that may eventually help to identify potential therapeutic targets.

    1. eLife assessment

      This is an important study that revealed a new noncoding RNA regulatory circuit involved in T cell function. The authors provide compelling evidence, more rigorous than the current state of the art, using genetically engineered mice and cell-based experiments. The interpretation of the results should be tempered due to the small effect size observed.

    2. Reviewer #1 (Public Review):

      Wheeler et al. have discovered a new RNA circuit that regulates T-cell function. They found that the long non-coding RNA Malat1 sponges miR-15/16, which controls many genes related to T cell activation, survival, and memory. This suggests that Malat1 indirectly regulates T-cell function. They used CRISPR to mutate the miR-15/16 binding site in Malat1 and observed that this disrupted the RNA circuit and impaired cytotoxic T-cell responses. While this study presents a novel molecular mechanism of T-cell regulation by Malat1-miR-15/16, the effects of Malat1 are weaker compared to miR-15/16. This could be due to several reasons, including higher levels of miR-15/16 compared to Malat1 or Malat1 expression being mostly restricted to the nucleus. Although the role of miR-15/16 in T-cell activation has been previously published, if the authors can demonstrate that miR-15/16 and/or Malat1 affect the clearance of Listeria or LCMV, this will significantly add to the current findings and provide physiological context to the study.

    3. Reviewer #2 (Public Review):

      This study connects prior findings on MicroRNA15/16 and Malat1 to demonstrate a functional interaction that is consequential for T cell activation and cell fate.

      The study uses mice (Malat1scr/scr) with a precise genetic modification of Malat1 to specifically excise the sites of interaction with the microRNA, but sparing all other sequences, and mice with T-cell specific deletion of miR-15/16. The effects of genetic modification on in vivo T-cell responses are detected using specific mutations and shown to be T-cell intrinsic.

      It is not known where in the cell the consequential interactions between MicroRNA15/16 and Malat1 take place. In the graphical abstract, the authors depict Malat1 as a nuclear lncRNA. Malat1 is very abundant, but it is unclear whether it can shuttle between the nucleus and cytoplasm. As the authors discuss, future work defining where in the cell the relevant interactions take place will be important.

      In addition to showing physiological phenotypic effects, the mouse models prove to be very helpful when the effects measured are small and sometimes hard to quantitate in the context of considerable variation between biological replicates (for example the results in Figure 4D).

      The impact of the genetic modification on the CD28-IL2-Bcl2 axis is quantitatively small at the level of expression of individual proteins, and there are likely to be additional components to this circuitry.

    1. Reviewer #1 (Public Review):

      More than ten years ago, it was shown that activity in the primary visual cortex of mice substantially increases when mice are running compared to when they are sitting still. This finding 'revolutionised' our thinking about the visual cortex, turning away from it being a passive image processor and highlighting the influence of non-visual factors. The current study now for the first time repeats this experiment in a primate (the marmoset). The authors find that in contrast to mice, marmoset V1 activity is slightly suppressed during running, and they relate this to differences in gain modulations of V1 activity between the two species.

      Strengths:

      - Replication in primates of the original finding in mice partly took so long, because of the inherent difficulties with recording from the brain of a running primate. The treadmill for the marmosets in the current study is a very elegant solution to this problem. It allows for true replication of the 'running vs stationary' experiment and undoubtedly opens up many possibilities for other experiments recording from a head-fixed but active marmoset.

      - In addition to their own data on the marmoset, the authors run their analyses on a publicly available data set on the mouse. This allows them to directly compare mouse and marmoset findings, which significantly strengthens their conclusions.

      Weaknesses:

      - The main thing that is missing from the study is a good explanation as to why running has such a different effect on marmoset V1 compared to mouse V1. Differences in neuromodulatory inputs are cited in the discussion as a possible cause for the discrepancy, but an obvious influencing factor that the authors could investigate in their own data set is the retinal input. In Fig1b, the authors even show these data in the form of gaze and pupil size. In these example data, by eye, it looks like the pupil size is positively correlated with the run speed. This would of course have large consequences on the activity in V1, but the authors do not do anything with these data. The study would improve substantially if the authors would correlate their run speed traces with other factors that they have recorded too, such as pupil size and gaze.

      - Fig2a shows the 'most correlated mouse session', i.e. the session where the relation between visual cortex activity and running speed was strongest. Looking at the raster plot, however, shows that this strong positive correlation must be due entirely to the lower half of the neurons significantly increasing their firing rate as the mouse starts to run; in fact, the upper 25% or so of the neurons show exactly the opposite (strong suppression of the neurons as the mouse starts running). It would be more balanced if this heterogeneity in the response is at least mentioned somewhere in the text.

      Significance:

      The paper provides interesting new evidence to the ongoing discussion about the influence of non-visual factors in general, and running in particular, on visual cortex activity. As such, it helps to pull this discussion out of the rodent field mainly and into the field of primate research. The elegant experimental set-up of the marmoset on a treadmill will certainly add new findings to this issue also in the years to come.

    2. eLife assessment

      This important work advances our understanding of the differences in locomotion-induced modulation in primate and rodent visual cortexes. The evidence in support of these differences across species is convincing, although greater use of the primate dataset with some additional analyses would have strengthened the claims. This work will be of broad interest to neuroscientists.

    3. Reviewer #2 (Public Review):

      This work aims at answering whether activity in the primate visual cortex is modulated by locomotion, as was reported for the mouse visual cortex. The finding that the activity in the mouse visual cortex is modulated by running has changed the concept of primary sensory cortical areas. However, it was an open question whether this modulation generalizes to primates.

      To answer this fundamental question the authors established a novel paradigm in which a head-fixed marmoset was able to run on a treadmill while watching a visual stimulus on a display. In addition, eye movements and running speed were monitored continuously and extracellular neuronal activity in the primary visual cortex was recorded using high-channel-count electrode arrays. This paradigm uniquely permitted investigation of whether locomotion modulates sensory-evoked activity in the visual cortex of a marmoset. Moreover, to directly compare the responses in the marmoset visual cortex to responses in the mouse visual cortex the authors made use of a publicly-available mouse dataset from the Allen Institute. In this dataset, the mouse was also running on a treadmill and observing a set of visual stimuli on a display. The authors took extra care to have the marmoset and mouse paradigms as comparable as possible.

      To characterize the visually driven activity the authors present a series of moving gratings and estimate receptive fields with sparse noise. To estimate the gain modulation by running the authors split the dataset into epochs of running and non-running which allowed them to estimate the visually evoked firing rates in both behavioral states.
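
      The epoch split described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `running_modulation`, the array layout (trials by units), and the speed threshold of 1.0 (arbitrary units) are all assumptions made for illustration. It shows only the general idea of labelling each trial as running or stationary and comparing mean evoked rates per unit between the two states.

```python
import numpy as np

def running_modulation(rates, speed, threshold=1.0):
    """Compare evoked firing rates between running and stationary trials.

    rates     : array-like, shape (n_trials, n_units), evoked rate per trial and unit
    speed     : array-like, shape (n_trials,), mean run speed on each trial
    threshold : hypothetical speed cutoff separating running from stationary
    """
    rates = np.asarray(rates, dtype=float)
    speed = np.asarray(speed, dtype=float)
    running = speed > threshold  # boolean mask over trials
    if running.all() or not running.any():
        raise ValueError("need both running and stationary trials")
    run_mean = rates[running].mean(axis=0)     # mean rate per unit while running
    still_mean = rates[~running].mean(axis=0)  # mean rate per unit while stationary
    # Ratio > 1 means higher rates during running, < 1 means suppression.
    # Units with a zero stationary rate yield inf or nan here.
    return run_mean, still_mean, run_mean / still_mean
```

      A ratio above 1 for most units would correspond to the gain increase reported in mice, and a ratio at or slightly below 1 to the weak suppression reported in marmosets.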

      Strengths:
      The novel paradigm of head-fixed marmosets running on a treadmill while being presented with a visual stimulus is unique and ideally tailored to answering the question that the authors aimed to answer. Moreover, the authors took extra care to ensure that the paradigm in the marmoset matched as closely as possible to the conditions in the mouse experiments such that the results can be directly compared. To directly compare their data the authors re-analyzed publicly available data from the visual cortex of mice recorded at the Allen Institute. Such a direct comparison, and reuse of existing datasets, is another strong aspect of the work. Finally, the presented new marmoset dataset appears to be of high quality, the comparison between the mouse and marmoset visual cortex is well done and the results and interpretation are straightforward.

      Weaknesses:
      While the presented results are clear and support the main conclusion of the authors, additional analysis and experimental details could have further strengthened and clarified some aspects of the results. For example, it is known that the locomotion gain modulation varies with layer in the mouse visual cortex, with neurons in the infragranular layers expressing a diversity of modulations (Erisken et al. 2014 Current Biology). However, for the marmoset dataset, it was not reported which cortical layer the neurons are from, leaving this point unanswered.

      Nonetheless, the aim of comparing the locomotion-induced modulation of activity in primate and mouse primary visual cortex was convincingly achieved by the authors. The results shown in the figures support the conclusion that locomotion modulates the activity in primate and mouse visual cortex differently. While mice show a profound gain increase, neurons in the primate visual cortex show little modulation or even a reduction in response strength.

      This work will have a strong impact on the field of visual neuroscience but also on neuroscience in general. It revives the debate of whether results obtained in the mouse model system can be simply generalized to other mammalian model systems, such as non-human primates. Based on the presented results, the comparison between the mouse and primate visual cortex is not as straightforward as previously assumed. This will likely trigger more comparative studies between mice and primates in the future, which is important and absolutely needed to advance our understanding of the mammalian brain.

      Moreover, the reported finding that neurons in the primary visual cortex of marmosets do not increase their activity during running is intriguing, as it makes you wonder why neurons in the mouse visual cortex do so. The authors discuss a few ideas in the paper which can be addressed in future experiments. In this regard, it is worth noting that the authors report an interesting difference between the foveal and peripheral parts of the visual cortex in marmoset. It will be interesting to investigate these differences in more detail in future studies. Likewise, while running might be an important behavioral state for mice, other behavioral states might be more relevant for marmosets and do modulate the activity of the primate visual cortex more profoundly. Future work could leverage the opportunities that the marmoset model system offers to reveal new insights about behavioral-related modulation in the primate brain.

    4. Reviewer #3 (Public Review):

      Prior studies have shown that locomotion (e.g., running) modulates mouse V1 activity to a similar extent as visual stimuli. However, it's unclear if these findings hold in species with more specialized and advanced visual systems such as nonhuman primates. In this work, Liska et al. leverage population and single neuron analyses to investigate potential differences and similarities in how running modulates V1 activity in marmosets and mice. Specifically, they discovered that although a shared gain model could describe well the trial-to-trial variations of population-level neural activity for both species, locomotion more strongly modulated V1 population activity in mice. Furthermore, they found that at the level of individual units, marmoset V1 neurons, unlike mouse V1 neurons, experience suppression of their activity during running.

      A major strength of this work is the introduction and completion of primate electrophysiology recordings during locomotion. Data of this kind was previously limited, and this work moves the field forward in terms of data collection in a domain previously inaccessible in primates. Another core strength of this work is that it adds to a limited body of cross-species data and analysis of neural activity at the single-unit and population level, attempting to standardize analysis and data collection to be able to make inferences across species.

      However, the authors did not take full advantage of the quantity and diversity of the marmoset visual cortex recordings in their analyses. They mention recording and analyzing the activity of peripheral V1 neurons but mainly present results involving foveal V1 neurons. Foveal neurons, with their small receptive fields strongly affected by precise eye position, would seem to be less likely to be comparable to rodent data. If the authors have a reason for not doing so, they should provide an explanation. Given that the marmosets are motivated to run with liquid rewards, the authors should provide more context as to how this may or may not affect marmoset V1 activity. Additionally, the lack of consideration of eye movements or position presents a major absence for the marmoset results, and fails to take advantage of one of the key differences between primate and rodent visual systems - the marmosets have a fovea, and make eye movements that fixate in various locations on the screen during the task. Finally, the model provides a strong basis for comparison at the level of neuronal populations, but some methodological choices are insufficiently described and may have an impact on interpreting the claims.

      Overall, the methods and data are supportive of the main claims of the work. The use of single neuron and population level approaches demonstrates that the activity of V1 in mice and marmosets is categorically different. Since primate V1 is so diverse, this limits the interpretation of the cross-species comparison. Still, the work is a great step forward in the field, especially with the novel methodology of collecting neural activity from running primates.

    1. Reviewer #2 (Public Review):

      The pear psylla Cacopsylla chinensis has two morphologically different forms, a winter-form and a summer-form, depending on temperature. The authors provided solid data showing that the cold sensor CcTRPM is responsible for switching from the summer-form to the winter-form, which is in turn regulated by the miRNA miR-252. This finding is interesting and novel.

    2. eLife assessment

      This is a valuable study of the molecular basis of summer-to-winter transition in Cacopsylla chinensis. Despite the convincing molecular and organism-level experiments, evidence of cold sensitivity in the protein of interest is incomplete, with a lack of methodological detail. The results of this study will be of interest to entomologists.

    3. Reviewer #1 (Public Review):

      Here, the authors describe, in detail, the transition between the summer form and the winter form of the pear psyllid, Cacopsylla chinensis. While the authors explore many components of this transition, the central hypotheses they seek to test are (i) that a protein they deem CcTRPM is a cold-sensitive Transient Receptor Potential Melastatin (TRPM) channel, and (ii) that this channel is involved in the summer-to-winter transition, in response to cold.

      The authors demonstrate that: both cold and menthol can initiate the summer-to-winter transition; that the protein of interest is required for the summer-to-winter transition (in vivo); that the protein of interest is involved in menthol-dependent Ca2+ transients (in vitro); that miR-252 expression is temperature-dependent, modulates the seasonal transition, and affects the expression of the transcript of interest; and finally, somewhat separately, that the chitin biosynthesis pathway is linked to the summer-to-winter transition.

      Although I generally found the evidence to be convincing, I note a few critical weaknesses in the manuscript as it is currently presented. Firstly, there is insufficient methodological detail to understand how the genes/transcripts/proteins in this work were identified. Further, the structural and phylogenetic analyses are incompletely described and the results are inconsistent with our previous understanding of the structure and evolution of TRPMs. It is thus possible (although unlikely) that this protein has been misidentified. Alternatively, this could be a structurally aberrant TRPM from a lineage previously presumed to be lost in insects, but there is not sufficient evidence to conclude this. Perhaps more importantly, the authors conclude that the protein of interest is cold sensitive (i.e., a "temperature receptor") primarily based on menthol sensitivity. Although menthol and cold activate the same receptors in other species, there is no demonstrated reason to think that menthol sensitivity necessitates cold sensitivity, or vice-versa. Thus, the authors' conclusions are, in my opinion, incomplete and overstated. Below are specific comments giving further context to the criticisms summarized above:

      1. The method used to identify the various genes/proteins described herein is not described. Relatedly, the alignment in Figure S1 lacks Trpms from non-hemipteran taxa, making it difficult to judge sequence similarity to other more well-characterized Trpms (e.g., from human, mouse, fly, nematode, etc.), and thus difficult to assess homology from the manuscript alone.

      2. The authors suggest that the CcTrpm has ankyrin repeats. To my knowledge, this would be the first description of ankyrin repeats in TRPM. It's not stated how the authors identified these putative ankyrin repeats. There's also no description of the absence or presence of previously identified Melastatin Homology Regions (MHRs), a C-terminal coiled-coil that is typically present, other C-terminal domain motifs, or the TRP domain. In the absence of methodological detail, and given the proposed presence of ankyrin repeats, it seems possible that this may not be TRPM.

      3. The authors suggest that, because mRNA abundance for CcTRPM is increased in response to cold, it is cold-sensitive. However, this says nothing as to whether cold actually activates the ion channel -- a critical distinction. The authors finally conclude that CcTRPM encodes a cold-sensitive ion channel because menthol elicits Ca2+ activity in vitro. However, this experiment only demonstrates that the protein is likely menthol sensitive. This experiment does not support the authors' conclusion that this is a cold-sensitive receptor (although their later knockdown experiments do, albeit indirectly).

      4. The lack of taxonomic representation in the phylogenetic analysis makes it difficult to interpret, especially in the context of methodological detail concerning the initial identification of the gene/transcript/protein of interest. Further, it's not stated if the tree is rooted (if it is, the rooting methodology is not described), the branch lengths are not shown, and the branch support methodology is not described. Many previous phylogenetic analyses have concluded--implicitly or explicitly--that there are at least two ancestral animal TRPM paralogs. From the perspective of vertebrates, one ancestral copy went on to diversify into TRPMs 1, 3, 6, and 7, and the other ancestral copy went on to diversify into TRPMs 2, 8, 4, and 5. The insect Trpms are generally thought to be more closely related to vertebrate TRPMs 1,3,6, and 7. If this phylogeny is rooted, it implies that the hemipteran Trpms are more closely related to vertebrates 2, 8, 4, and 5 (or at least 8, since that is all that is present here), and quite distantly related to other insect Trpms (and presumably, to vertebrates 1,3,6, and 7, which are not present). To my knowledge, this would be the first description of this Trpm subfamily in insects, but there is insufficient evidence or phylogenetic rigor here to conclude that. The most likely explanation is that the tree is unrooted, incorrectly rooted, or that the protein of interest is not TRPM.

    1. eLife assessment

      This valuable Tools and Resources paper presents new tools for investigating GLP-1 signaling: a genetically-encoded sensor constructed from a mutated GLP1R receptor as well as a caged agonist peptide. The evidence for these tools working as advertised is largely convincing and they may be helpful for screening compounds that bind to GLP1R. On the other hand, their overall utility is limited by their very weak apparent affinity relative to the likely biological concentration and response. Incomplete characterization of the properties of the tools makes it difficult to anticipate which applications are most likely to succeed.

    1. eLife assessment

      This valuable Tools and Resources paper presents new tools for investigating GLP-1 signaling: a genetically-encoded sensor constructed from a mutated GLP1R receptor as well as a caged agonist peptide. The evidence for these tools working as advertised is solid and they may be helpful for screening compounds that bind to GLP1R.

    1. Reviewer #1 (Public Review):

      The study titled "Distinct states of nucleolar stress induced by anti-cancer drugs" by Potapova and colleagues demonstrates that different chemotherapeutic agents can induce nucleolar stress, which manifests with varying cellular and molecular characteristics. The study also proposes a mechanism for how a novel type of nucleolar stress driven by CDK inhibitors may be regulated. As a reviewer, I appreciate the unbiased screening approach and I am enthusiastic about the novel insights into cell biology and the implications for cancer research and treatment. The study has several significant strengths: i) it highlights the understudied role of nucleolar stress in the on- and off-target effects of chemotherapy; ii) it defines novel molecular and cellular characteristics of the different types of nucleolar stress phenotypes; iii) it proposes novel modes of action for well-known drugs.

      However, there are several important points that should be addressed:
      • The rationale behind choosing RPE cells for the screen is unclear. It might be more informative to use cancer cells to study the effects of chemotherapeutic agents. Alternatively, were RPE cells selected to evaluate the side effects of these agents on normal cells? Clarifying these points in the introduction and discussion would guide the reader.
      • Figure 2F indicates that DLD1 and HCT116 cells are less sensitive to nucleolar changes induced by several inhibitors, including CDK inhibitors. It would be crucial to correlate these differences with cell viability. Are these differences due to cell-type sensitivity or variations in intracellular drug levels? Assessing cell viability and intracellular drug concentration for the same drugs and cells would provide valuable insights.
      • Have the authors interpreted nucleolar stress as the primary cause of cell death induced by these drugs? When cells treated with CDK inhibitors exhibit the dissociated nucleoli phenotype, is this effect reversible? Is this phenotype indicative of cell death commitment? Conducting a washout experiment to measure the recovery of nucleolar function and cell viability would address these questions.
      • The correlation between the loss of Treacle phosphorylation and nucleolar stress upon CDK inhibition is intriguing. However, it remains unclear how these two events are related. Would Treacle knockdown yield the same nucleolar phenotype as CDK inhibition? Moreover, would point mutations that abolish Treacle phosphorylation prevent its interaction with Pol-I? Experiments addressing these questions would enhance our understanding of the correlation/causation between Treacle phosphorylation and the effects of CDK inhibition on nucleolar stress.

      Overall, this study is significant and novel as it sheds light on the importance of nucleolar stress in defining the on-target and off-target effects of chemotherapy in normal and cancer cells.

    2. eLife assessment

      This study and its associated data are compelling, novel, important, and well carried out. The study demonstrates a novel finding that different chemotherapeutic agents can induce nucleolar stress, which manifests with varying cellular and molecular characteristics. The study also proposes a mechanism for how a novel type of nucleolar stress driven by CDK inhibitors may be regulated. The study sheds light on the importance of nucleolar stress in defining the on-target and off-target effects of chemotherapy in normal and cancer cells.

    3. Reviewer #2 (Public Review):

      This is an interesting study with high-quality imaging and quantitative data. The authors devise a robust quantitative parameter that is easily applicable to any experimental system. The drug screen data can potentially be helpful to the wider community studying nucleolar architecture and the effects of chemotherapy drugs. Additionally, the authors find Treacle phosphorylation as a potential link between CDK9 inhibition, rDNA transcription, and nucleolar stress. Therefore I think this would be of broad interest to researchers studying transcription, CDKs, nucleolus, and chemotherapy drug mechanisms. However, the study has several weaknesses in its current form as outlined below.

      1. Overall the study seems to suffer from a lack of focus. At first, it feels like a descriptive study aimed at characterizing the effect of chemotherapy drugs on the nucleolar state. But then the authors dive into the mechanism of CDK inhibition and then suddenly switch to studying biophysical properties of nucleolus using NPM1. Figure 6 does not enhance the story in any way; on the contrary, the findings from Fig. 6 are inconclusive and therefore could lead to some confusion.

      2. The justification for pursuing CDK inhibitors is not clear. Some of the top hits in the screen were mTOR, PI3K, HSP90, Topoisomerases, but the authors fail to properly justify why they chose CDKi over other inhibitors.

      3. In addition to poor justification, it seems like a very superficial attempt at deciphering the mechanism of CDK9i-mediated nucleolar stress. I think the most interesting part of the study is the link between CDK9, Pol I transcription, and nucleolar stress. But the data presented is not entirely convincing. There are several important controls missing as detailed below.

      4. The authors did not test if inhibition of CDK7 and/or CDK12 also induces nucleolar stress. CDK7 and CDK12 are also major kinases of RNAPII CTD, just like CDK9. Importantly, there are well-established inhibitors against both these kinases. It is not clear from the text whether these inhibitors were included in the screen library.

      5. In Figure 4E, the authors show that Pol I is reduced in the nucleolus/on rDNA. The authors should include an orthogonal method like chromatin fractionation and/or ChIP.

      6. In Fig. 5D, the in vitro kinase assay lacks important controls. The authors should include S to A mutants of Treacle S1299A/S1301A to demonstrate that CDK9 phosphorylates these two residues specifically.

      7. To support their model, the authors should test if overexpression of Treacle mutants S1299A/S1301A can partially phenocopy the nucleolar stress seen upon CDK9 inhibition. This would considerably strengthen the authors' claim that reduced Treacle phosphorylation leads to Pol I disassociation from rDNA and consequently leads to nucleolar stress.

      8. Additionally, it would be interesting if S1299D/S1301D mutants could partially rescue CDK9 inhibition.

    1. eLife assessment

      This study examines the role of the locus coeruleus in the extinction of instrumental behaviors. The work is valuable to highlight the function of this part of the brain. Evidence is incomplete as spontaneous recovery is not demonstrated in control subjects. Further analyses of the effect of the manipulations on baseline lever pressing, magazine entries, and the coupling of lever-presses and magazine entries would help capture the function of the locus coeruleus.

    2. Reviewer #1 (Public Review):

      In this paper by Lui and colleagues, the authors examine the role of locus coeruleus (LC)-noradrenaline (NA) neurons in the extinction of appetitive instrumental conditioning. They report that optogenetic activation of global LC-NA neurons during the conditioned stimulus (CS) period of extinction enhances long-term extinction memory without affecting within-session extinction. In contrast, LC-NA activation during the intertrial interval doesn't affect extinction and long-term memory. They then show that optogenetic activation of LC-NA neurons doesn't induce conditioned place preference/avoidance. Finally, they assess the necessity of LC-NA neurons in appetitive extinction and find that optogenetic inactivation of LC-NA neurons during the CS period results in the enhancement of within-session extinction. The experiments are well-designed, including offset control in the optogenetic activation study. I think this study adds new insight into the LC-NA system in the context of appetitive extinction.

      Strengths:
      ・These studies identify that the artificial activation of LC-NA neurons enhances long-term memory of appetitive extinction, while this activation can't induce long-term conditioned place aversion. Thus, optogenetic activation of LC-NA neurons can inhibit spontaneous recovery of appetitive extinction without causing long-term aversive memory.
      ・Optoinhibition study demonstrates the reduction of a conditioned response of within-session extinction. Therefore, LC-NA neuronal activity at the CS period of extinction could act as anti-extinction or be important for the expression of the conditioned response.

      Weaknesses:
      ・It is unclear how LC-NA neurons behave during the CS period of appetitive extinction from this study. This weakens the importance of the optogenetic inactivation result.
      ・While authors manipulate global LC-NA neurons, many people find functionally heterogeneous populations in the LC. It remains unsolved if there is a specific LC-NA subpopulation responsible for appetitive extinction.

    3. Reviewer #2 (Public Review):

      This study examines the role of the Locus Coeruleus (LC)/noradrenergic (NA) system in extinction in male and female rats. The behavioural task involves three phases: i) training on a discriminative procedure in which operant responding was rewarded only during the presentation of a stimulus; ii) extinction; and iii) testing for the expression of extinction at both short (1 day) and long (7 days) delays. Targeting LC/NA cells with optogenetics in TH::Cre rats, the authors found that photoexcitation during extinction led to an increase in the expression of extinguished responding at both short and long delays. By contrast, photoinhibition was found to be without an effect.

      1. In such discrimination training, Pavlovian (CS-Food) and instrumental (LeverPress-Food) contingencies are intermixed. It would therefore be very interesting if the authors provided evidence of other behavioural responses (e.g. magazine visits) during extinction training and tests.<br /> 2. In Figure 1, the authors show the behavioural data of the different groups of control animals which were later collapsed in a single control group. It would be very nice if the authors could provide the data for each step of the discrimination training.<br /> 3. Inspection of Figures 2C & 2D shows that responding in control animals is about the same at test 2 as at the end of extinction training. Therefore, could the authors provide evidence for spontaneous recovery in control animals? This is of importance given that the main conclusion of the authors is that LC stimulation during extinction training led to an increased expression of extinction memory as expressed by reduced spontaneous recovery.<br /> 4. Current evidence suggests that there are differences in LC/NA system functioning between males and females. Could the authors provide details about the allocation of male and female animals in each group?<br /> 5. The histology section in both experiments looks a bit unsatisfying. Could the authors provide more details about the number of counted cells and also their distribution along the antero-posterior extent of the LC? Could the authors also take into account the sex in such an analysis?

    4. Reviewer #3 (Public Review):

      The introduction/background is excellent. It reviews evidence showing that the extinction of conditioned responding is regulated by noradrenaline and suggests that the locus coeruleus (LC) may be a critical locus of this regulation. This naturally leads to the aim of the study: to determine whether the locus coeruleus is involved in the extinction of an appetitive conditioned response. Overall, the study is well-designed, nicely conducted and the results advance our understanding of the role of the LC in the extinction of conditioned behaviour. As such, I believe that these results will be of interest to readers. I do, however, feel that the paper would benefit from the inclusion of additional data to clarify the impact of the LC manipulations (stimulation and inhibition) on performance in the task; and some comment regarding the likely source of differences between the groups at test.

    1. eLife assessment

      This study aggregates across five fMRI datasets and reports that a network of brain areas previously associated with response inhibition processes, including several in the basal ganglia, are more active on failed stop than successful stop trials. This study is valuable as a well-powered investigation of fMRI measures of stopping. However, evidence for the authors' conclusions regarding the role of subcortical nodes in stopping is incomplete, due to the limitations of fMRI and a lack of theoretical synthesis.

    2. Reviewer #1 (Public Review):

      This is my first review of the article entitled "The canonical stopping network: Revisiting the role of the subcortex in response inhibition" by Isherwood and colleagues. This study is one in a series of excellent papers by the Forstmann group focusing on the ability of fMRI to reliably detect activity in small subcortical nuclei - in this case, specifically those purportedly involved in the hyper- and indirect inhibitory basal ganglia pathways. I have been very fond of this work for a long time, beginning with the demonstration of De Hollander, Forstmann et al. (HBM 2017) of the fact that 3T fMRI (as well as many 7T imaging sequences) does not afford a sufficient signal-to-noise ratio to reliably image these small subcortical nuclei. This work has done a lot to reshape my view of seminal past studies of subcortical activity during inhibitory control, including some that have several thousand citations.

      In the current study, the authors compiled five datasets that aimed to investigate neural activity associated with stopping an already initiated action, as operationalized in the classic stop-signal paradigm. Three of these datasets are taken from their own 7T investigations, and two are datasets from the Poldrack group, which used 3T fMRI.

      The authors make six chief points:<br /> 1. There does not seem to be a measurable BOLD response in the purportedly critical subcortical areas in contrasts of successful stopping (SS) vs. going (GO), neither across datasets nor within each individual dataset. This includes the STN but also any other areas of the indirect and hyperdirect pathways.<br /> 2. The failed-stop (FS) vs. GO contrast is the only contrast showing substantial differences in those nodes.<br /> 3. The positive findings of STN (and other subcortical) activation during the SS vs. GO contrast could be due to the usage of inappropriate smoothing kernels.<br /> 4. The study demonstrates the utility of aggregating publicly available fMRI data from similar cognitive tasks.<br /> 5. From the abstract: "The findings challenge previous functional magnetic resonance (fMRI) of the stop-signal task"<br /> 6. and further: "suggest the need to ascribe a separate function to these networks."

      I strongly and emphatically agree with points 1-5. However, I vehemently disagree with point 6, which appears to be the main thrust of the current paper, based on the discussion, abstract, and - not least - the title.

      To me, this paper essentially shows that fMRI is ill-suited to study the subcortex in the specific context of the stop-signal task. That is not just because of the issues of subcortical small-volume SNR (the main topic of this and related works by this outstanding group), but also because of its limited temporal resolution (which is unacknowledged, but especially impactful in the context of the stop-signal task). I'll expand on what I mean in the following.

      First, the authors are underrepresenting the non-fMRI evidence in favor of the involvement of the subthalamic nucleus (STN) and the basal ganglia more generally in stopping actions.<br /> - There are many more intracranial local field potential recording studies that show increased STN LFP (or even single-unit) activity in the SS vs. FS and SS vs. GO contrast than listed, which come from at least seven different labs. Here's a (likely non-exhaustive) list of studies that come to mind:<br /> o Ray et al., NeuroImage 2012<br /> o Alegre et al., Experimental Brain Research 2013<br /> o Benis et al., NeuroImage 2014<br /> o Wessel et al., Movement Disorders 2016<br /> o Benis et al., Cortex 2016<br /> o Fischer et al., eLife 2017<br /> o Ghahremani et al., Brain and Language 2018<br /> o Chen et al., Neuron 2020<br /> o Mosher et al., Neuron 2021<br /> o Diesburg et al., eLife 2021<br /> - Similarly, there is much more evidence than cited that causally influencing STN via deep-brain stimulation also influences action-stopping. Again, the following list is probably incomplete:<br /> o Van den Wildenberg et al., JoCN 2006<br /> o Ray et al., Neuropsychologia 2009<br /> o Hershey et al., Brain 2010<br /> o Swann et al., JNeuro 2011<br /> o Mirabella et al., Cerebral Cortex 2012<br /> o Obeso et al., Exp. Brain Res. 2013<br /> o Georgiev et al., Exp Br Res 2016<br /> o Lofredi et al., Brain 2021<br /> o van den Wildenberg et al, Behav Brain Res 2021<br /> o Wessel et al., Current Biology 2022<br /> - Moreover, evidence from non-human animals similarly suggests critical STN involvement in action stopping, e.g.:<br /> o Eagle et al., Cerebral Cortex 2008<br /> o Schmidt et al., Nature Neuroscience 2013<br /> o Fife et al., eLife 2017<br /> o Anderson et al., Brain Res 2020

      Together, studies like these provide either causal evidence for STN involvement via direct electrical stimulation of the nucleus or provide direct recordings of its local field potential activity during stopping. This is not to mention the extensive evidence for the involvement of the STN - and the indirect and hyperdirect pathways in general - in motor inhibition more broadly, perhaps best illustrated by their damage leading to (hemi)ballism.

      Hence, I cannot agree with the idea that the current set of findings "suggest the need to ascribe a separate function to these networks", as suggested in the abstract and further explicated in the discussion of the current paper. For this to be the case, we would need to disregard more than a decade's worth of direct recording studies of the STN in favor of a remote measurement of the BOLD response using (provably) sub-ideal imaging parameters. There are myriad explanations for why fMRI may not be able to reveal a potential ground-truth difference in STN activity between the SS and FS/GO conditions, beginning with the simple proposition that it may not afford sufficient SNR, or that perhaps subcortical BOLD is not tightly related to the type of neurophysiological activity that distinguishes these conditions (in the purported case of the stop-signal task, specifically the beta band). But essentially, this paper shows that a specific lens into subcortical activity is likely broken, but then also suggests dismissing existing evidence from superior lenses in favor of the findings from the 'broken' lens. That doesn't make much sense to me.

      Second, there is actually another substantial reason why fMRI may indeed be unsuitable to study STN activity, specifically in the stop-signal paradigm: its limited time resolution. The sequence of subcortical processes on each specific trial type in the stop-signal task is purportedly as follows: at baseline, the basal ganglia exert inhibition on the motor system. During motor initiation, this inhibition is lifted via direct pathway innervation. This is when the three trial types start diverging. When actions then have to be rapidly cancelled (SS and FS), cortical regions signal to STN via the hyperdirect pathway that inhibition has to be rapidly reinstated (see Chen, Starr et al., Neuron 2020 for direct evidence for such a monosynaptic hyperdirect pathway, the speed of which directly predicts SSRT). Hence, inhibition is reinstated (too late in the case of FS trials, but early enough in SS trials, see recordings from the BG in Schmidt, Berke et al., Nature Neuroscience 2013; and Diesburg, Wessel et al., eLife 2021).<br /> Hence, according to this prevailing model, all three trial types involve a sequence of STN activation (initial inhibition), STN deactivation (disinhibition during GO), and STN reactivation (reinstantiation of inhibition during the response via the hyperdirect pathway on SS/FS trials, reinstantiation of inhibition via the indirect pathway after the response on GO trials). What distinguishes the trial types during this period is chiefly the relative timing of the inhibitory process (earliest on SS trials, slightly later on FS trials, latest on GO trials). However, these temporal differences play out on a level of hundreds of milliseconds, and in all three cases, processing concludes well under a second overall. To fMRI, given its limited time resolution, these activations are bound to look quite similar.
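
      The temporal-resolution argument above can be illustrated with a rough sketch. It assumes, purely for illustration, that each trial type produces one brief burst of activity and that the bursts differ only in latency (here 200 ms apart), and it uses a generic double-gamma haemodynamic response function whose shape parameters are assumptions rather than values from the reviewed study. Under these assumptions the two predicted BOLD time courses are almost perfectly correlated.

```python
import math

def hrf(t):
    """Generic double-gamma HRF (assumed shape: peak near 5 s, late undershoot)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def bold_response(onset_s, dt=0.05, duration_s=30.0):
    """Predicted BOLD for a single impulse-like event at onset_s (seconds)."""
    n_samples = int(duration_s / dt)
    return [hrf(i * dt - onset_s) for i in range(n_samples)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical latencies: inhibition reinstated 200 ms later on one trial type.
early = bold_response(onset_s=0.2)
late = bold_response(onset_s=0.4)
similarity = pearson(early, late)
print(f"correlation between predicted BOLD time courses: {similarity:.4f}")
```

      The sketch ignores noise, sampling at the repetition time, and overlapping responses from neighbouring trials, all of which would make the two conditions even harder to separate in practice.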

      Lastly, further building on this logic, it's not surprising that FS trials yield increased activity compared to SS and GO trials. That's because FS trials are errors, which are known to activate the STN (Cavanagh et al., JoCN 2014; Siegert et al. Cortex 2014) and afford additional inhibition of the motor system after their occurrence (Guan et al., JNeuro 2022). Again, fMRI will likely conflate this activity with the abovementioned sequence, resulting in a summation of activity and the highest level of BOLD for FS trials.

      In sum, I believe this study has a lot of merit in demonstrating that fMRI is ill-suited to study the subcortex during the SST, but I cannot agree that it warrants any reappreciation of the subcortex's role in stopping, which is not chiefly based on fMRI evidence.

      A few other points:<br /> - As I said before, this team's previous work has done a lot to convince me that 3T fMRI is unsuitable to study the STN. As such, it would have been nice to see a combination of the subsamples of the study that DID use imaging protocols and field strengths suitable to actually study this node. This is especially true since the second 3T sample (and arguably, the Isherwood_7T sample) does not afford a lot of trials per subject, to begin with.<br /> - What was the GLM analysis time-locked to on SS and FS trials? The stop-signal or the GO-signal?<br /> - Why was SSRT calculated using the outdated mean method?<br /> - The authors chose 3.1 as a z-score to "ensure conservatism", but since they are essentially trying to prove the null hypothesis that there is no increased STN activity on SS trials, I would suggest erring on the side of a more lenient threshold to avoid type-2 error.<br /> - The authors state that "The results presented here add to a growing literature exposing inconsistencies in our understanding of the networks underlying successful response inhibition". It would be helpful if the authors cited these studies and what those inconsistencies are.

    3. Reviewer #2 (Public Review):

      This work aggregates data across 5 openly available stopping studies (3 at 7 tesla and 2 at 3 tesla) to evaluate activity patterns across the common contrasts of Failed Stop (FS) > Go, FS > stop success (SS), and SS > Go. Previous work has implicated a set of regions that tend to be positively active in one or more of these contrasts, including the bilateral inferior frontal gyrus, preSMA, and multiple basal ganglia structures. However, the authors argue that upon closer examination, many previous papers have not found subcortical structures to be more active on SS than FS trials, bringing into question whether they play an essential role in (successful) inhibition. In order to evaluate this with more data and power, the authors aggregate across five datasets and find many areas that are *more* active for FS than SS, specifically bilateral preSMA, caudate, GPE, thalamus, and VTA, and unilateral M1, GPi, putamen, SN, and STN. They argue that this brings into question the role of these areas in inhibition, based upon the assumption that areas involved in inhibition should be more active on successful stop than failed stop trials, not the opposite as they observed.

      As an empirical result, I believe that the results are robust, but this work does not attempt a new theoretical synthesis of the neuro-cognitive mechanisms of stopping. Specifically, if these many areas are more active on failed stop than successful stop trials, and (at least some of) these areas are situated in pathways that are traditionally assumed to instantiate response inhibition like the hyperdirect pathway, then what function are these areas/pathways involved in? I believe that this work would make a larger impact if the author endeavored to synthesize these results into some kind of theoretical framework for how stopping is instantiated in the brain, even if that framework may be preliminary.

      I also have one main concern about the analysis. The authors use the mean method for computing SSRT, but this has been shown to be more susceptible to distortion from RT slowing (Verbruggen, Chambers & Logan, 2013 Psych Sci), and goes against the consensus recommendation of using the integration with replacement method (Verbruggen et al., 2019). Therefore, I would strongly recommend replacing all mean SSRT estimates with estimates using the integration with replacement method.
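
      To make the difference between the two estimators concrete, here is a minimal sketch of both. The function and argument names are hypothetical, and the integration variant follows the general recipe as commonly described (go omissions replaced with the maximum go RT, then the RT at the quantile given by p(respond | stop signal) minus the mean SSD); it is an illustration, not the authors' pipeline.

```python
import math
from statistics import mean

def ssrt_mean_method(go_rts, ssds):
    """Mean method: mean go RT minus mean stop-signal delay."""
    return mean(go_rts) - mean(ssds)

def ssrt_integration(go_rts, n_go_omissions, ssds, p_respond_given_signal):
    """Integration method with replacement of go omissions.

    go_rts: RTs (ms) of all go trials with a response.
    n_go_omissions: number of go trials without a response.
    ssds: stop-signal delays (ms) of all stop trials.
    p_respond_given_signal: proportion of stop trials with a response.
    """
    # Omitted go responses are replaced with the slowest observed go RT.
    rts = sorted(list(go_rts) + [max(go_rts)] * n_go_omissions)
    # Take the RT at the quantile defined by p(respond | signal).
    n = math.ceil(p_respond_given_signal * len(rts))
    n = min(max(n, 1), len(rts))
    nth_rt = rts[n - 1]
    return nth_rt - mean(ssds)

# Toy data (ms) with one very slow go trial to mimic a skewed RT distribution.
go_rts = [400, 420, 440, 460, 900]
ssds = [200, 250, 300]

print(ssrt_mean_method(go_rts, ssds))          # 274
print(ssrt_integration(go_rts, 1, ssds, 0.5))  # 190
```

      In this toy example a single slow go trial inflates the mean-method estimate, whereas the quantile-based estimate is unaffected, which is the kind of distortion the recommendation is meant to avoid.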

    1. eLife assessment

      Based on a technological advance which couples onboard calcium imaging with in vivo electrophysiology in freely behaving mice, this work presents important insights about the brain circuits through which the cerebellum could participate in social interactions. In particular, correlative measurements provide interesting but incomplete evidence that connections between cerebellum and cingulate cortex specifically contribute to the complex sensory-motor computations underlying social contacts. This study is of interest for a broad range of neurophysiologists.

    2. Reviewer #1 (Public Review):

      In this manuscript, the authors describe an improved miniscope they name "E-scope", combining in vivo calcium imaging with electrophysiological recording. They use it to examine neural correlates of social interactions with respect to cerebellar and cortical circuits. Through correlations between electrophysiological single units of Purkinje cells and dentate nucleus neurons as well as with calcium signals imaging of neurons from the anterior cingulate cortex, the authors provide correlative data supporting the view that intracerebellar circuits and cerebello-cortical communications take part in the modulation of social behavior. In particular, the electrophysiological dataset reflects the PC-DN connection and strongly suggests its involvement in social interactions. Cross-correlations analyses between PC / DN single units and ACC calcium signals suggest that the recorded cerebellar and cortical structures both take part in the brain networks at play in social behavior.

      Strengths:<br /> - This is a timely and important study with solid evidence for correlative conclusions that are not overstated in the manuscript, which is commendable.<br /> - Despite the technical challenge, the experiments presented in this study seem well performed and the quality of the dataset is appropriate.

      Weaknesses:<br /> - While the novelty of the device is strongly emphasized, I find that its value is somewhat diminished by the wire-free device developed by the same group as it should thus be possible to perform calcium imaging wire-free and electrophysiological recording via a single conventional cable (or also via wireless headstages).<br /> - The role of the identified network activations in social interactions is not touched upon.

    3. Reviewer #2 (Public Review):

      This report by Hur et al. examines simultaneous activity in the cerebellum and anterior cingulate cortex (ACC) to determine how activity in these regions is coordinated during social behavior. To accomplish this, the authors developed a recording device named the E-scope, which combines a head-mounted mini-scope for in vivo Ca2+ imaging with an extracellular recording probe (in the manuscript they use a 32-channel silicon probe). Using the E-scope, the authors find subpopulations of cerebellar neurons with social-interaction-related activity changes. The activity pattern is predominantly decreased firing in PCs and increases in DNs, which is the expected reciprocal relationship between these populations. They also find social-interaction-related activity in the ACC. The authors nicely show the absence of locomotion onset and offset activity in PCs and DNs, ruling out that it is movement-driven. Analysis showed high correlations between cerebellar and ACC populations (namely, Soc+ACC and Soc+DN cells). The finding of correlated activity is interesting because non-motor functions of the cerebellum are relatively little explored. However, the causal relationship is far from established with the methods used, leaving it unclear if these two brain regions are similarly engaged by the behavior or if they form a pathway/loop. Overall, the data are presented clearly, and the manuscript is well written; however, the biological insight gained is rather limited.

    4. Reviewer #3 (Public Review):

      Complex behavior requires complex neural control involving multiple brain regions. The currently available tools to measure neural activity in multiple brain regions in small animals are limited and often involve obligatory head-fixation. The latter, obviously, impacts the behaviors under study. Hur and colleagues present a novel recording device, the E-Scope, that combines optical imaging of fluorescent calcium imaging in one brain region with high-density electrodes in another. Importantly, the E-Scope can be implanted and is, therefore, compatible with usage in freely moving mice. The authors used their new E-Scope to study neural activity during social interactions in mice. They demonstrate the presence of neural correlates of social interaction that happen simultaneously in the cerebellum and the anterior cingulate cortex.

      The major accomplishment of this study is the development and introduction of the E-Scope. The evaluation of this part can be short: it works, so the authors succeeded.

      The authors managed to reduce the weight of the implant to 4.5 g, which is - given all functionality - quite an accomplishment in my view. However, a mouse weighs between 20 and 40 g, so that an implant of 4.5 g is still quite considerable. It can be expected that this has an impact on the behavior and, possibly, the well-being of the animals. Whether this is the case or not, is not really addressed in this study. The authors suffice with the statement that "Recorded animals made more contact with the other mouse than with the object (Figure 2A), suggesting a normal preference for social contact with the E-Scope attached."

      Overall, the description of animal behavior is rather sparse. The methods state only that stranger age-matched mice were used, but do not state their gender. The nature of the social interactions was not described. Was there aggressive behavior, sexual approach and/or intercourse? Did the stranger mice attack/damage the E-Scope? Were the interactions comparable (using which parameters?) with and without E-Scope attached? It is not even described what the authors define as an "interaction bout" (Figure 2A). The number of interaction bouts is counted per 7 minutes, I presume? This is not specified explicitly.

      In Figure 1 D-G, the authors present raw data from the neurophysiological recordings. In panel D, we see events with vastly different amplitudes. It would be very insightful if the authors would describe which events they considered to be action potentials, and which not. Similarly, the raw traces of Figure 1E are declared to be single-unit recordings of Purkinje cells. Partially due to the small size of the traces (invisible in print and pixelated in the digital version), I have a hard time recognizing complex spikes and simple spikes in these traces. This is a bit worrisome, as the authors declare the typical duration of the pause in simple spike firing after a complex spike to be 20-100 ms. In my experience, such long pauses are rare in this region, and definitely not typical. In the right panel of Figure 1A, an example of a complex spike-induced pause is shown. This pause is around 15 ms, so not typical according to the text, and starts only around 4 ms after the complex spike, which should not be the case and suggests either a misalignment of the figure or the detection of complex spike spikelets as simple spikes, while the abnormally long pause suggests that the authors fail to detect a lot of simple spikes. The authors could provide more confidence in their data by including more raw data, making explicit how they analyzed the signals, and by reporting basic statistics of firing properties (like rate, cv or cv2, pause duration). In this respect, Figure 2 - figure supplement 3 shows quite a large percentage of cells to have either a very low or a very high firing rate.
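
      As an illustration of the basic firing statistics requested above, the following sketch computes mean rate, CV, and CV2 from a list of spike times, plus the pause between each complex spike and the next simple spike. All names are hypothetical, spike times are assumed to be in seconds, CV uses the population standard deviation of the inter-spike intervals, and CV2 is taken as the mean of 2|ISI(i+1) - ISI(i)| / (ISI(i+1) + ISI(i)) over adjacent interval pairs.

```python
from bisect import bisect_right
from statistics import mean, pstdev

def firing_stats(spike_times):
    """Return (rate in Hz, CV, CV2) for a sorted list of spike times in seconds."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    rate = len(isis) / (spike_times[-1] - spike_times[0])
    cv = pstdev(isis) / mean(isis)
    # CV2 compares each interval only with its neighbour, so it is less
    # sensitive to slow drifts in firing rate than the global CV.
    cv2 = mean([2 * abs(b - a) / (a + b) for a, b in zip(isis, isis[1:])])
    return rate, cv, cv2

def complex_spike_pauses(complex_times, simple_times):
    """Time from each complex spike to the next simple spike (seconds)."""
    simple_sorted = sorted(simple_times)
    pauses = []
    for t in complex_times:
        i = bisect_right(simple_sorted, t)
        if i < len(simple_sorted):
            pauses.append(simple_sorted[i] - t)
    return pauses

# Toy spike train with alternating 1 s and 3 s intervals.
rate, cv, cv2 = firing_stats([0.0, 1.0, 4.0, 5.0, 8.0])
print(rate, cv, cv2)
```

      Reporting distributions of these values per cell, alongside the pause durations, would let readers judge whether simple spikes are being missed or spikelets misclassified.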

      The number of Purkinje cells recorded during social interactions is quite low: only 11 cells showed a modulation in their spiking activity (unclear whether in complex spikes, simple spikes or both). During object interaction, only 4 cells showed a significant modulation. It is unclear whether the latter 4 are a subset of the former 11, or whether "social cells" and "object cells" are different categories. Having so few cells, and with these having different types of modulation, the group of cells for each type of modulation is really small, going down to 2 cells/group. It is doubtful whether meaningful interpretation is possible here.

      This brings us to the next point: neural correlates of social interaction are notoriously difficult to interpret. Social behavior is complex, and involves the processing of sensory cues (olfaction, touch (whiskers), visual and auditory), the production of ultrasonic vocalizations (in specific contexts), movements, and emotional behavior (fear, pleasure, sexual interest). In other words, neural activity patterns observed during social interaction do not necessarily relate specifically to social interaction, but can also occur in a non-social context. The authors control this by comparing social interactions with object interactions, but I miss a direct comparison between the two conditions, both in terms of behavior (now only the number of interactions is counted, not their duration or intensity), and in terms of neural activity. There is some analysis done on the interaction between movement and cerebellar activity (Figure 2 - figure supplement 4), but it is unclear to what extent social interactions and movements are separated here. It would already help to indicate the social interactions in the plots with trajectories (e.g., Fig. 2H), for example with social interaction-related movements in red and the rest of the trajectories in black.

      The neuron count in the anterior cingulate cortex is much higher than for the cerebellum, but also here it is not so clear what is "social" and what is "non-social". In Figure 3G-H, the authors indicate a near-perfect separation between cells active during social encounters and those active during object encounters. This could indicate that there is here indeed a social aspect, but as we do not know to what extent the sensory and motor aspects differ between social and non-social interactions, this is still hard to interpret.

      Finally, the authors show that there are correlations between the modulation in neurons of the anterior cingulate cortex and cerebellar neurons related to bouts of social activity. Here, it could be interesting to see whether there are differences in latency between the two brain areas.

      In conclusion, the authors present a novel method to record neural activity with single cell-resolution in two brain regions in freely moving mice. Given the challenges associated with understanding of complex behaviors, this approach can be useful for many neuroscientists. The authors demonstrate the potential of their approach by studying social interactions in mice. Clearly, there are correlations in the activity of neurons in the anterior cingulate cortex and the cerebellum related to social interactions. To bring our understanding of these patterns to a higher level, more detailed analyses (and probably also larger group sizes of cerebellar neurons) are required, though.

    1. eLife assessment

      This important study reveals the use of an allocentric spatial reference frame in how the perception of the location of a dimly lit target is updated during locomotion. The evidence supporting this claim is convincing, based on a series of cleverly and carefully designed behavioral experiments. The results will be of interest not only to scientists who study perception, action and cognition, but also to engineers who work on developing visually guided robots and self-driving vehicles.

    2. Reviewer #1 (Public Review):

      This study conducted a series of experiments to comprehensively support the allocentric rather than egocentric visual spatial reference updating for the path-integration mechanism in the control of target-oriented locomotion. Authors firstly manipulated the waiting time before walking to tease apart the influence from spatial working memory in guiding locomotion. They demonstrated that the intrinsic bias in perceiving distance remained constant during walking and that the establishment of a new spatial layout in the brain took a relatively longer time beyond the visual-spatial working memory. In the following experiments, the authors then uncovered that the strength of the intrinsic bias in distance perception along the horizontal direction is reduced when participants' attention is distracted, implying that world-centered path integration requires attentional effort. This study also revealed horizontal-vertical asymmetry in a spatial coding scheme that bears a resemblance to the locomotion control in other animal species such as desert ants.

      The overall design of the behavioral experiments is elegant and statistics are well performed to support the authors' viewpoint in the allocentric rather than egocentric visual spatial coding scheme for distance perception along the horizontal line.

      It is, however, worth noting the statement from Gibson in 1979 that for egocentric distances, tangible information arises from the effort required to walk a distance; thus, effort becomes associated through experience with visual distance cues. Accordingly, visual information alone is insufficient to support the awareness of distance. Perceived distance is rather specified by an invariant relationship between distal extent and a person's potential to perform gross motion actions such as walking. This view is supported later by Proffitt et al. (2003), in which participants wore backpacks and their perceived distance increased compared with the baseline condition. The authors need to acknowledge the physical effort in addition to visual information for the spatial coding and may consider the manipulation of physical effort in the future to support the robustness of constant intrinsic bias in ground-based spatial coding during walking.

      Furthermore, it would be more comprehensive and fit into the Neuroscience Section if the authors can add in current understandings of the spatial reference frames in neuroscience in the introduction and discussion, and provide explanations on how the findings of this study supplement the physiological evidence that supports our spatial perception as well. For instance, world-centered representations of the environment, or cognitive maps, are associated with hippocampal formation while self-centered spatial relationships, or image spaces, are associated with the parietal cortex (see Bottini, R., & Doeller, C. F. (2020). Knowledge Across Reference Frames: Cognitive Maps and Image Spaces. Trends in Cognitive Sciences, 24(8), 606-619. https://doi.org/10.1016/j.tics.2020.05.008 for details)

    3. Reviewer #2 (Public Review):

      The study provides a valuable contribution by demonstrating the use of an allocentric spatial reference frame in the perception of the location of a dimly lit target in the dark. While the evidence presented in support of the authors' claims is solid and convincing, it would be beneficial for the study to address potential limitations, such as its ecological validity.

      Strengths:<br /> Unlike previous research where observers were stationary during a visual-spatial perception task, this recent study expanded upon prior findings by incorporating bodily movements for the observers. This study is a valuable addition to the literature as it not only discovered that the intrinsic bias is grounded on the home base, but also identified several key characteristics through a series of follow-up experiments. The findings suggest that this "allocentric" spatial coding decays over time, requires attentional resources, can be based solely on vestibular signals, and is most effective in the horizontal direction. In general, this study is interesting, clearly presented, well-thought-out and executed. The results confirmed the conclusions and the study's comprehensive approach offers valuable insights into the nature of intrinsic bias in spatial perception.

      The counter-intuitive results presented in the manuscript are intriguing and add to the study's overall appeal. Moreover, the manuscript draws an interesting parallel between human spatial navigation and that of desert ants. This comparison helps to underscore the importance of understanding spatial coding mechanisms across different species and highlights potential avenues for future research.

      One aspect I particularly valued about this study was the authors' thorough description of the experimental methods. This level of detail not only highlights the rigor of the research but also enhances the reproducibility of the study, making it more accessible for future researchers.

      Weaknesses:
      While the current study provides valuable insights into the nature of intrinsic bias in spatial perception, there is a concern regarding its ecological validity. The experimental design involved stringent precautions, such as a very dark room and a small target, to minimize the presence of depth cues. This is in contrast to the real world, where depth information is readily available from the ground and surrounding objects, aiding in our perception of space and depth. As a result, it is unclear to what extent this "allocentric" intrinsic bias is involved in our everyday spatial perception. To provide more context for the general audience, it would be beneficial for the authors to address this issue in their discussion.

      The current findings on the "allocentric" coding scheme raise some intriguing questions as to why such a mechanism would be developed and how it could be beneficial. The finding that the "allocentric" coding scheme results in less accurate object localization and requires attentional resources seems counterintuitive and raises questions about its usefulness. However, this observation presents an opportunity for the manuscript to discuss the potential evolutionary advantages or trade-offs associated with this coding mechanism.

      The manuscript lacks a thorough description of the data analysis process, particularly regarding the fitting of the intrinsic bias curve (e.g., the blue and gray dashed curve in Figure 3c) and the calculation of the horizontal separation between the curves. It would be beneficial for the authors to provide more detailed information on the specific function and parameters used in the fitting process and the formula used for the separation calculation to ensure the transparency and reproducibility of the study's results.

    4. Reviewer #3 (Public Review):

      This study investigated which kind of reference frame (allocentric or egocentric) we use for perception in darkness. This question is essential and has rarely been addressed before. The authors compared perception in the walking condition with that in the stationary condition, which successfully isolated the contribution of self-movement to the spatial representation. In addition, the authors carefully manipulated the waiting period, attentional load, vestibular input, testing task, and walking direction (forward or backward) to systematically examine the nature of the reference frame in darkness.

      I am a bit confused by Figure 2b. An allocentric coordinate refers to the representation of the distance and direction of an object relative to other objects, not relative to the observer. In Figure 2, however, the authors assumed that the perceived target was located at the intersection of the intrinsic bias curve and the viewing line from the NEW eye position to the target. This suggests that the perceived object depends on the observer's new location, which seems at odds with the allocentric coordinate hypothesis.

      According to Fig 2b, the perceived size should be left-shifted and lifted up in the walking condition compared to that in the stationary condition. However, in Figure 3C and Fig 4, the perceived size was the same height as that in the baseline condition.

      Could the left-shifted perceived distance reflect a kind of compensation mechanism? Participants could not see the target's location but knew they had moved forward. Their brains may therefore automatically compensate for this self-movement when judging the location of a target. This would perfectly predict the left-shifted but not upward-shifted data in Fig 3C. A similar compensation mechanism exists for size constancy, in which we tend to compensate for distance when computing object size.

      According to Fig 2a, the target, perceived target, and eye should be aligned in one straight line. This means that connecting the physical targets and the corresponding perceived target results in straight lines that converge at the eye position. This seems, however, unlikely in Figure 3c.
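
The alignment argument in the preceding paragraph can be checked numerically. The following is a minimal sketch, assuming 2D coordinates (horizontal distance, height) and a hypothetical helper named `convergence_point`; neither the function nor any coordinate values come from the manuscript. It finds the point closest to all lines that join each physical target to its perceived location. A small residual means the lines converge at a single point, as Fig 2a would require of the eye position.

```python
import math

def convergence_point(targets, perceived):
    """Least-squares point closest to every line joining a physical
    target to its perceived location (2D), plus the RMS distance of
    that point from the lines. Hypothetical helper for illustration."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    lines = []
    for (tx, ty), (px, py) in zip(targets, perceived):
        dx, dy = px - tx, py - ty
        norm = math.hypot(dx, dy)
        nx, ny = -dy / norm, dx / norm   # unit normal to the line
        c = nx * tx + ny * ty            # line equation: nx*x + ny*y = c
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
        lines.append((nx, ny, c))
    det = a11 * a22 - a12 * a12          # zero if all lines are parallel
    x = (b1 * a22 - b2 * a12) / det
    y = (a11 * b2 - a12 * b1) / det
    rms = math.sqrt(sum((nx * x + ny * y - c) ** 2 for nx, ny, c in lines) / len(lines))
    return (x, y), rms
```

With digitized target and perceived-target coordinates from a figure such as Figure 3c, a large RMS value, or a convergence point far from the assumed eye position, would quantify the discrepancy described above.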

    1. eLife assessment

      The study provides valuable insights into allosteric regulation of BTK, a non-receptor protein kinase, challenging previous models. Using a variety of biophysical and functional techniques, the paper presents evidence that the N-terminal PH-TH domain of BTK exists in a conformational ensemble surrounding a compact SH3-SH2-kinase core, that the BTK kinase domain can form partially active dimers, and that the PH domain can form a novel inhibitory interface after SH2/SH3 disengagement. Overall the presented evidence is solid, but the EM results may be over-interpreted and the work would benefit from additional functional validation.

    2. Reviewer #1 (Public Review):

      The manuscript by Lin et al describes a wide biophysical survey of the molecular mechanisms underlying full-length BTK regulation. This is a continuation of this lab's excellent work on deciphering the myriad levels of regulation of BTKs downstream of their activation by plasma membrane localised receptors.

      The manuscript combines cryo EM, HDX-MS and mutational analysis to examine how the accessory domains modify the activity of the kinase domain. It offers three main novel insights into BTK regulation.

      1. Cryo EM and SAXS show that the PHTH region is dynamic compared to the conserved Src module.
      2. A 2nd generation tethered PH-kinase construct crystal of BTK reveals a unique orientation of the PH domain relative to the kinase domain that differs from previous structures.
      3. A new structure of the kinase domain dimer shows how trans-phosphorylation can be achieved.

      Excitingly, these structures allow for the generation of a model of how BTK can act as a strict coincidence sensor for both the activated BCR complex and PIP3 before it obtains full activity. To my eye, the most exciting result of this work is the description of how the PH domain can inhibit activity once the SH3/SH2 domain is disengaged, allowing for an additional level of regulatory control.

      I have very few experimental concerns as the methods and figures are well-described and clear. As the authors are potentially saying that the previously solved PH domain-kinase interface is artefactual, additional evidence strengthening their model would be helpful to resolve any possible controversies.

    3. Reviewer #2 (Public Review):

      In this study, multiple biophysical techniques were employed to investigate the activation mechanism of BTK, a multi-domain non-receptor protein kinase. Previous studies have elucidated the inhibitory effects of the SH3 and SH2 domains on the kinase and the potential activation mechanism involving the membrane-bound PIP3 inducing transient dimerization of the PH-TH domain, which binds to lipids.

      The primary focus of the present study was on three new constructs: a full-length BTK construct, a construct where the PH-TH domain is connected to the kinase domain, and a construct featuring a kinase domain with a phosphomimetic at the autophosphorylation site Y551. The authors aimed to provide new insights into the autoinhibition and allosteric control of BTK.

      The study reports that SAXS analysis of the full-length BTK protein construct, along with cryoEM visualization of the PH-TH domain, supports a model in which the N-terminal PH-TH domain exists in a conformational ensemble surrounding a compact/autoinhibited SH3-SH2-kinase core. This finding is interesting because it contradicts previous models proposing that each globular domain is tightly packed within the core.

      Furthermore, the authors present a model for an inhibitory interaction between the N-lobe of the kinase and the PH-TH domain. This model is based on a study using a tethered complex with a longer tether than a previously reported construct where the PH-TH domain was tightly attached to the kinase domain (ref 5). The authors argue that the new structure is relevant. However, this assertion requires further explanation and discussion, particularly considering that the functional assays used to assess the impact of mutating residues within the PH-TH/kinase domain contradict the results of the previous study (ref 5).

      Additionally, the study presents the structure of the kinase domain with swapped activation loops in a dimeric form, representing a previously unseen structure along the trans-phosphorylation pathway. This structure holds potential relevance. To better understand its significance, employing a structure/function approach like the one described for the PH-TH/kinase domain interface would be beneficial.

      Overall, this study contributes to our understanding of the activation mechanism of BTK and sheds light on the autoinhibition and allosteric control of this protein kinase. It presents new structural insights and proposes novel models that challenge previous understandings. However, further investigation and discussion would significantly strengthen the study.

    4. Reviewer #3 (Public Review):

      Yin-wei Lin et al set out to visualize the inactive conformation of full-length Bruton's Tyrosine Kinase (BTK), a molecule that has evaded high-resolution structural studies in its full-length form to date. An open question in the field is how the Pleckstrin Homology-Tec Homology (PHTH) domain inhibits BTK activity, with multiple competing models proposed. The authors used a complementary set of biophysical techniques combined with well-thought-out stabilizing mutations to obtain structural insights into BTK regulation in its full-length form. They were able to crystallize the full-length construct of BTK but, unfortunately, the PHTH was not resolved, yielding a structure similar to that previously obtained in the field. The investigation of the same construct by SAXS yielded an elongated structural model, consistent with previous SAXS studies. Using cryo-EM, the authors obtained a low-resolution model for the FL BTK with a loosely connected density assigned to the dynamic PHTH around the compact SH2-SH3-Kinase Domain (KD) core. To gain further molecular insights into PHTH-KD interactions, the authors followed a previously reported strategy and generated a fusion of PHTH-KD with a longer linker, yielding a crystal structure with a novel PHTH-KD interface, which they tested in biochemical assays. Lastly, Yin-wei Lin et al crystallized the BTK KD in a novel partially active state as a "face-to-face" dimer in which the kinases exchange activation loops that, although partially disordered, are theoretically perfectly positioned for transphosphorylation. Overall, this is a valiant effort to gain molecular insights into what clearly is a dynamic regulatory motif on BTK and is a valuable addition to the field.

      However, this work can be improved by considering these points:

      1) The cryo-EM reconstructions are potentially over-interpreted. The reported resolution for all of the analyzed reconstructions is better than 8Å, at which point helices should be recognized as well-resolved structural elements. In the current view/depiction of the cryo-EM maps/models it is hard to see such structural features and it would be great if the authors could include a panel showing maps at higher thresholds to show correspondence between the helices in the kinase C lobe and the cryo-EM maps. Otherwise, the overall positioning of the models within the cryo-EM maps is hard to evaluate and may very well be wrong. (Fig 4, S2).

      2) With the above in mind, if the maps are not at the point where helices are well resolved, it may be beneficial to low-pass filter the maps to a more conservative resolution for fitting, analysis, and representation. (Fig 4, S2).

      3) It would be valuable to get a quantitative metric on the model/map fitting for the cryo-EM work. One good package for this is Situs which provides cross-correlation values for the top orthogonal fits, without user input for initial fitting. This would again increase the confidence in the correctness of model positioning on the map. (Fig 4, S2).

      4) It would be great to see 2D class averages from the particles contributing to each of the 3D classes. Theoretically, a clear bright "blob" (hypothesized to be the PHTH domain) should be observable in the 2D class averages. In the current 2D class averages that region is unconvincingly weak. (Fig 4, S2).

      5) It seems like there was quite a large circular mask applied during 2D classification. Are the authors confident that the weak density attributed to the PHTH domain is not neighboring particles making their way into the extraction box? It would be great if the authors would trim their particle stack with a very stringent inter-particle distance cutoff (or report the cutoff in the manuscript if this has already been done) to minimize this possibility.

      6) The cryo-EM processing may benefit from more stringent particle picking. The authors picked over 2M particles from 750 micrographs which likely represents very heavy overpicking. I would encourage the authors to re-pick the micrographs with 2D class averages and use more stringent metrics to reduce the overpicking. This may result in higher-resolution reconstructions. (Fig 4, S2).

      7) The Dmax from SAXS for the Full Length BTK is at 190Å. It would be great if the authors could make a cartoon of what domain arrangement may satisfy this distance, as it is quite extended for such a small particle. Can the authors rule out dimerization at SAXS concentrations? (Fig 1).

      8) In Figure S1 (C) it seems that the curves are just scattering curves with Guinier plots in the insets, but they are labeled as Guinier plots in the legend. The Guinier plots for some samples (FL 4P1F) show signs of aggregation, which may complicate the analysis; it could be beneficial to redo these.

      9) Have the authors verified that the activation loop mutations they introduce do not disrupt PHTH binding? They previously reported that the activation loop of BTK interacts with the PHTH, an interaction they do not see here. If so, a citation would be helpful in the text. If not, testing this would strengthen the paper.

      10) Can the authors comment on the surfaces which are accessible and inaccessible to the PHTH in the crystal (Fig 3E)? The fact that PHTH doesn't adopt a stable conformation in the solvent channel to some degree indicates that the accessible interaction surfaces are not suitable for PHTH interactions, as the "effective concentration" of the PHTH would be quite high. Are these surfaces consistent with the cryo-EM analysis?

      11) For the novel active state dimer of the Kinase Domain it would be great to see some functional validation of the dimerization interface. It is structurally certainly quite suggestive, but without such experiments the functional significance is unclear. If appropriate mutations have been published previously a citation would be helpful.
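
The quantitative fit metric requested in point 3 above is, at its core, a normalized cross-correlation between the experimental map and a density simulated from the model. The following is a minimal sketch of that quantity only. It assumes both densities have already been sampled on the same voxel grid and flattened to equal-length lists; it is not the Situs implementation, and the function name is hypothetical.

```python
import math

def normalized_cross_correlation(map_density, model_density):
    """Pearson-style correlation between two densities sampled on the
    same voxel grid (flattened to 1D). Returns a value in [-1, 1]."""
    if len(map_density) != len(model_density):
        raise ValueError("densities must be sampled on the same grid")
    n = len(map_density)
    mean_map = sum(map_density) / n
    mean_model = sum(model_density) / n
    cov = sum((a - mean_map) * (b - mean_model)
              for a, b in zip(map_density, model_density))
    var_map = sum((a - mean_map) ** 2 for a in map_density)
    var_model = sum((b - mean_model) ** 2 for b in model_density)
    if var_map == 0 or var_model == 0:
        raise ValueError("a flat density has no defined correlation")
    return cov / math.sqrt(var_map * var_model)
```

Because the mean is subtracted and the result is normalized, the score is insensitive to the overall scale and offset of the map. This is why such scores are normally compared across alternative placements of the same model rather than read as absolute values.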

    1. eLife assessment

      The manuscript describes a valuable theoretical calculation focusing on the structural changes in the photosynthetic reaction center postulated by others based on time-resolved crystallography using X-ray free-electron laser (XFEL) (Dods et al., Nature, 2021). The authors argue that calculated changes in redox potential Em and deformations using the XFEL structures may reflect experimental errors rather than real structural changes. The study is still incomplete in the sense that it focuses on explaining why the proposed structural changes do not match the theoretical calculations, but it does not yet provide an alternative model.

    2. Reviewer #1 (Public Review):

      First, I agree with the authors of this manuscript that conformational changes in the XFEL structures at 2.8 Å resolution are not reliable enough to demonstrate the subtle changes in the electron transfer events in this bacterial photosynthesis system. Indeed, the data statistics in the paper by Dods et al. showed that the high-resolution range of some of the XFEL datasets may include pretty high noise (low CC1/2 and high Rsplit), so the comparison of the subtle conformational changes of the structures is problematic.

      The manuscript by Gai Nishikawa investigated time-dependent changes in the energetics of the electron transfer pathway based on the structures by Dods et al. by calculating the redox potential of the active and inactive branches in the structures, and found no clear link between the time-dependent structural changes and the electron transfer events in the XFEL structures published by Dods, R. et al. (2021). This study provides validation for the interpretation of the structures of those electron-transferring proteins.

      The paper was well prepared.

    3. Reviewer #2 (Public Review):

      The manuscript by Nishikawa et al. addresses time-dependent changes in the electron transfer energetics in the photosynthetic reaction center from Blastochloris viridis, whose time-dependent structural changes upon light illumination were recently demonstrated by time-resolved serial femtosecond crystallography (SFX) using an X-ray free-electron laser (XFEL) (Dods et al., Nature, 2021). Based on the redox potential Em values of bacteriopheophytin in the electron transfer active branch (BL), obtained by solving the linear Poisson-Boltzmann equation, the authors found that Em(HL) values in the charge-separated 5-ps structure obtained by XFEL are not clearly changed, suggesting that the P+HL- state is not stabilized by protein reorganization. Furthermore, chlorin ring deformation upon HL- formation, which was expected from their QM/MM calculation, is not recognized in the 5-ps XFEL structure. The authors then concluded that the structural changes in the XFEL structures are not related to the actual time course of charge separation. They argued that their calculated changes in Em and chlorin ring deformations using the XFEL structures may reflect experimental errors rather than real structural changes; they attributed this problem to the fact that the XFEL structures were not obtained at high resolution (mostly at 2.8 Å). I consider that their systematic calculations may suggest a useful theoretical interpretation of the XFEL study. However, the present manuscript as a whole takes the negative position that experimental errors may prevent the XFEL structures from revealing the actual structural changes relevant to the electron transfer events. My concerns are the following two points:
      Is the authors' premise for the electron transfer energetics obviously valid?
      Could the authors find any positive aspect(s) in the XFEL study?

      The authors' argument certainly rests on their premise "Em(HL) is expected to be exclusively higher in the 5-ps and 20-ps structures than in the other XFEL structures due to the stabilization of the [PLPM]•+HL•- state by protein reorganization" as noted in the Results and Discussion (p. 12, lines 180-182); however, it is unknown whether this premise can be applied to the ps-timescale electron transfer events. The above premise is surely based on the Marcus theory, as the authors also noted in the Introduction "The anionic state formation induces not only reorganization of the protein environment (ref. 5: Marcus and Sutin, 1985) but also out-of-plane distortion of the chlorin ring (ref. 6: two of the authors, Saito and Ishikita, co-authored, 2012)"; however, it is unknown whether protein reorganization can follow the ps-timescale electron transfer events. Indeed, Dods et al. mentioned in the Nature paper (2021) "The primary electron-transfer step from SP (special pair PLPM) to BPhL (HL) occurs in 2.8 ± 0.2 ps across a distance of 10 Å by means of a two-step hopping mechanism via the monomeric BChL molecule and is more rapid than conventional Marcus theory". It was also mentioned, "By contrast, the 9 Å electron-transfer step from BPhL to QA has a single exponential decay time of 230 ± 30 ps, which is consistent with conventional Marcus theory". As for the primary electron-transfer step from PLPM to HL, Wang et al. (2007, Science 316, 747; cited as ref. 8 in the Nature paper 2021) reported, by monitoring tryptophan absorbance changes in various reaction centers in which the driving forces (namely, the Em gaps between PLPM and HL) are different, that the protein relaxation kinetics is independent of the charge separation kinetics on the picosecond timescale. On the other hand, in the EPR study cited by the authors as ref. 7 (Muh et al. (1998) Biochemistry 37, 13066), although the authors described "two distinct conformations of HL- were reported in spectroscopic studies" (p. 3, lines 44-45), it should be noted that the conformation of HL- was formed by 1 or 45 s illumination prior to freezing, and hence the conformations reorganized on the timescale of seconds may differ from the picosecond-order conformations observed by the XFEL study (Nature, 2021) and/or the transient absorption spectroscopy (Science, 2007).

      Therefore, I consider there is a possibility that the authors' findings may reflect not experimental errors but the actual ps-timescale phenomena presented by the first-time XFEL study on the timescale of the primary charge-separation reactions of photosynthesis. Thus I would like to suggest that the authors reconsider the premise for the electron transfer energetics on the picosecond timescale.

      In any case, to discuss the experimental errors in the XFEL study, it would be better to calculate the Em(QA) changes in the 300-ps and 8-µs XFEL structures, which showed distinctive structural changes even at the 2.8 Å resolution as discussed by Dods et al. Then, if the Em(QA) values are changed as expected from theoretical calculations, such calculated results may suggest a useful theoretical interpretation of the XFEL study as a positive aspect. If the Em(QA) values are not higher in the 300-ps and 8-µs structures than in the other structures, it may be argued that the experimental errors would be so large that the XFEL structures are irrelevant to the electron transfer events expected from theoretical calculations.

    1. eLife assessment

      This interesting study describes the development of a three-dimensional cell culture system to investigate muscle tissue development and homeostasis. It is a solid study that could be valuable in the study of human as opposed to animal cells in studying muscular disorders.

    2. Reviewer #1 (Public Review):

      The authors aimed to establish a cell culture system to investigate muscle tissue development and homeostasis. They successfully developed a complex 3D cell model and conducted a comprehensive molecular and functional characterization. This approach represents a critical initial step towards using human cells, rather than animals, to study muscular disorders in vitro. Although the current protocol is time-consuming and the fetal cell model may not be mature enough to study adult-onset diseases, it nonetheless provides a valuable foundation for future disease modelling studies using isogenic iPSC lines or patient-derived cells with specific mutations. The manuscript does not explore whether or how this stem cell model can advance our understanding of muscular diseases, which would be an exciting avenue for future research. Overall, the detailed protocol presented in this paper will be useful for informing future studies and provides an important resource to the stem cell community. The inclusion of data on disease modelling using isogenic iPSC lines or patient-derived cells would further enhance the manuscript's impact.

    3. Reviewer #2 (Public Review):

      This paper illustrates that PSCs can model myogenesis in vitro by mimicking the in vivo development of the somite and dermomyotome. The advantages of this 3D system include (1) better structural distinctions, (2) the persistence of progenitors, and (3) the spatial distribution (e.g. migration, confinement) of progenitors. The finding is important, with implications for disease modeling. Indeed, the authors tried a DMD model, although it lacks deeper characterization.

      The differentiation protocol is compelling and based on the current understanding of myogenesis. The authors characterized the organoids in depth (e.g. many time points and immunofluorescence). The evidence is solid and could be further improved by more rigorous analyses and descriptions, as described below.

      Major comments:

      1. Consistency between different cell lines.
      I see the authors used a few different PSC lines. Since organoid efficiency differs between lines, it is important to note the consistency between lines.

      2. Heterogeneity among organoids.
      Let's say the authors get 10 organoids in one well. Are they similar to each other? Does each organoid have a similar cell composition? To determine the heterogeneity, the authors could try either FACS or multiple sectioning of each organoid.

      3. Consistency of ACh current between organoids.
      Related to comment 2, are the currents consistent between organoids? How many organoids were recorded in the figures? Also, please comment on whether the current differs between young and aged organoids.

      4. Communication between neural cells and muscle?
      The authors performed scRNAseq but did not analyze it in depth. I would recommend performing receptor-ligand mapping to address whether neural cells and muscle are interacting.

      5. More characterization of DMD organoids.
      One of the key applications of muscle organoids is disease modeling. The authors generated DMD muscle organoids but barely characterized them beyond currents. I recommend conducting immunofluorescence of DMD organoids to confirm structural changes. It would also be very intriguing to see scRNAseq of DMD organoids aligned with disease etiology.

      6. More characterization of the engraftment.
      The authors could compare the size of myotubes between mouse and human. Do PAX7+ satellite cells exist in the graft? To exclude the possibility that cell fusion events account for the observation, I recommend engrafting into GFP+ immunodeficient mice. Could the authors comment on how long the graft survives?

    1. Joint Public Review:

      Throughout the study, there is insufficient information about how experiments were performed and how often (imaging, pull-downs etc), how data was acquired, modified and analysed (especially imaging data, see below), how statistical analyses were done and what is presented in the figures (single planes or maximum intensity projections etc). This makes it difficult to evaluate the data and results.

      There is insufficient information about tools and reporters used. This is misleading and impacts the conclusions that can be made from the results presented. To give an example, in Figure 1D-F, the authors present data that HDA-1::GFP and LIN-53::mNeonGreen (both components of the nucleosome remodeling and deacetylation complex) but not the histone acetyltransferase MYS-1::GFP are 'asymmetrically segregated' during QR.a division. However, the authors do not mention that HDA-1::GFP and LIN-53::mNeonGreen are expressed at endogenous levels (they are CRISPR alleles) whereas MYS-1::GFP is overexpressed (integration of a multi-copy extrachromosomal array). The difference in 'segregation' could therefore be a consequence of different levels of expression rather than different modes of segregation ('asymmetric' versus 'symmetric').

      There is insufficient information about the phenotypes of the animals used (RNAi knock-downs of hda-1, lin-53 RNAi, pig-1 etc). Again this is misleading and impacts the conclusions that can be made. To give some examples, (1) in Figure 3A-G, control RNAi embryos are compared to hda-1 RNAi and lin-53 RNAi embryos. What the authors do not mention is that hda-1 RNAi and lin-53 RNAi embryos have severe developmental defects and essentially cannot be compared to control RNAi embryos. The differences between the embryos can be seen in Figure S7B where bright-field images of control RNAi, hda-1 RNAi and lin-53 RNAi embryos are shown. At the 350 min time point, a normal embryo is visible for the control, a 'ball of cells' embryo for hda-1 RNAi and an embryo that seems to have arrested at an earlier developmental stage (and therefore have much larger cells) for lin-53 RNAi. Because of these pleiotropic phenotypes, it is unclear whether differences seen for example in sAnxV::GFP positive cells (Figure 3A) are the result of a direct effect of hda-1(RNAi) on cell death or whether they are the result of global changes in development and cell fate induced by hda-1(RNAi). hda-1(RNAi) and lin-53(RNAi) embryos are also used for the data shown in Figures S6 and S7, raising the same concerns; (2) the authors do not mention what the impact of Baf A1 treatment is on animals; however, the images provided in Figure 5E indicate that Baf A1 treatment causes pleiotropic effects in L1 larvae.

      There is a lack of adequate controls. Because of this, some of the data presented must be considered as preliminary. To give some examples: (1) controls are lacking for the data shown in Figure 3D-G (i.e. genes other than egl-1). Since hda-1 RNAi has a pleiotropic effect and most likely affects H3K27 acetylation genome-wide, this is critical. Based on what is shown, it is unclear whether the results presented are specific to egl-1 or not; (2) the co-IP and mass spec data shown in Figure 4A, C and Figure S8 also lack a critical control, which is GFP only. Because of this, it is unclear whether subunits of the V-ATPase bind to HDA-1 or GFP. The co-IP and mass spec data forms the basis of Figures 5 and 6 as well as Figure S9. Data presented in these figures therefore has to be considered preliminary as well.

      Inappropriate methods are used. For this reason, some of the data again must be considered preliminary. To give some examples: (1) in Figure 5A, B, the authors used super-ecliptic pHluorin to look at changes in pH in the daughter cells. However, the authors used quenching of super-ecliptic pHluorin fluorescence rather than a ratiometric method to 'measure' changes in pH. Because of this, it is unclear whether the changes in fluorescence observed are due to changes in pH or changes in the amount of pHluorin protein. Figure 5A, B forms the basis for the experiments presented in the remaining parts of Figure 5 as well as in Figure 6 and Figure S9; (2) the authors' description of how some images were modified before quantitative analysis raises concerns. The figures of concern are particularly Figure 1 and Figure S4, where background subtraction with denoising and deconvolution was used. Background subtraction with denoising and deconvolution is an image manipulation that enhances the contrast between background and what looks like foreground. Therefore, background subtraction should be applied primarily in experiments involving image segmentation, not fluorescence intensity measurement. Because the authors provide no information about the kind of subtraction that was made, it cannot be excluded that this processing led to an uneven subtraction across the image, which can easily lead to artefacts. Since the fluorescence intensity in the smaller daughter cell is lower, and thus closer to background, the algorithm the authors used may have misinterpreted the grey value information in the smaller daughter cell pixels. This could have led to an asymmetric subtraction of background in the two daughter cells, with a stronger subtraction in the smaller daughter cell. Ultimately, their processing could have artificially increased the intensity asymmetry between the two daughter cells in all their results.
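
The concern in point (2) above can be illustrated with simple arithmetic. The sketch below uses hypothetical intensity values, not measurements from the manuscript, and a hypothetical helper `corrected_ratio`. It shows that when the brighter cell is in the numerator, any over-estimate of the background inflates the apparent intensity ratio between the two daughter cells.

```python
def corrected_ratio(raw_large, raw_small, background):
    """Intensity ratio of the larger to the smaller daughter cell after
    subtracting one background value from both raw measurements."""
    if raw_small <= background:
        raise ValueError("background must be below the dimmer cell's intensity")
    return (raw_large - background) / (raw_small - background)

# Hypothetical raw intensities: true signals of 300 and 200 (ratio 1.5)
# sitting on a true background of 100 grey levels.
raw_large, raw_small = 400.0, 300.0

accurate = corrected_ratio(raw_large, raw_small, background=100.0)
over_subtracted = corrected_ratio(raw_large, raw_small, background=150.0)
```

Here `accurate` recovers the true ratio, while `over_subtracted` is larger even though the underlying signals are unchanged. A subtraction that removes more from the dimmer cell than from the brighter one, as hypothesized above, would bias the ratio in the same direction.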

      The imaging data is of low quality (for example Figures 1, 2, 5, 6; Figures S2, S3, S5, S6, S9). Since much of the study and the findings are based on imaging, this is a major concern. Critical parameters are not mentioned (number of sections in z-stack, size of the field-of-view, laser power used etc), which makes it difficult to understand what was done and what one is looking at. To give some specific examples, (1) the images shown in Figure 2B are of very low quality with severe background from neighbouring cells. In addition, the outline of the cells (plasma membrane) or the nuclei of the daughter cells is unknown. Based on this, it is not clear how the authors could have measured 'Fluorescence intensity ratio between sister nuclei' in an accurate and unbiased way (what is clear from these images is that there is an increase in HDA-1::GFP signal in ALL surviving daughters (asymmetric and symmetric divisions) post cytokinesis but not in the daughter cell that is about to die (asymmetric and unequal division)); (2) the images in Figure 6A and Figure S9A on VHA-17 segregation and its colocalization to ER and lysosome segregation during QR.a division are of very low quality and it is unclear to the reviewer how such images were used to obtain the quantitative data shown.

      In some cases, there is a discrepancy between what is shown in figures and what the authors state in the text. To give some examples: (1) on page 7, the authors state "..., we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold at cytokinesis, respectively (Figure 1D-E)." Looking at the images for HDA-1 and LIN-53 in Figure 1D, the increase in the ratio mainly occurs between 4 min and 6 min, which is post cytokinesis and NOT prior to cytokinesis; (2) these images (Figure 1D) also show that there is an increase in the HDA-1 and LIN-53 signals in the larger daughter cells (QR.ap), which suggests that the increase in ratios (Figure 1E) is the result of increased HDA-1 and LIN-53 synthesis post cytokinesis. However, on top of page 8, the authors state "The total fluorescence of HDA-1, LIN-53 and MYS-1 remained constant during ACDs, suggesting that protein redistribution may establish NuRD asymmetry (Figure S4C)." In Figure S4C, the authors present straight lines for 'relative total fluorescence' for imaging (probably z-stacks) that was done every min over the course of 7 min. If there was no increase in material as the authors claim, they should have seen significant photobleaching over the course of the 7 min and therefore reduced level of 'relative total fluorescence' over time. How the data presented in Figure S4C was generated is therefore unclear. (Despite the fact that the authors claim that the asymmetry seen is not due to new synthesis in the larger daughter cell post cytokinesis, it would be more consistent with the first experiment presented in this study (Figure S1) that shows that there is more hda-1 mRNA in egl-1(-) cells compared to egl-1(+) cells); (3) On page 12, the authors state "..., in Baf A1-treated animals, QRaa inherited similar levels of HDA-1::GFP as its sister cell,...". 
However, looking at the image provided in Figure 5E (0 min), there seems to be a similar ratio of HDA-1::GFP between the daughter cells in DMSO and Baf A1-treated animals.

    1. eLife assessment

      This study presents useful findings regarding the impact of forest cover and fragmentation on the prevalence of malaria in non-human primates. The evidence supporting the claims of the authors is, however, incomplete, as the sampling design cannot adequately address the geospatial issues that this study focuses on.

    2. Reviewer #1 (Public Review):

      The study as a concept is well designed, although there are two issues I see in the methodology (these may be just needing further explanation or if I am correct in my interpretation of what was done, may need reanalysis to take into account). Both issues relate to the data that was extracted from the published literature on zoonotic malaria prevalence in the study area.

      1. No limit was set on the temporal range.
      With no temporal limit on the range of studies, the landscape in many cases will have changed between the study being conducted and the spatial data. This will be particularly marked in areas where there has been clearing since the zoonotic malaria prevalence study. Also, population changes (either through population growth, decline or movement) will have occurred. All research is limited in what it can do with the available data, so I realise that there may not be much the authors can do to correct this. One possible solution would be to look at the land use change at each site between the prevalence study and the remote sensing data. I'm not sure if this is feasible, but if it is I would recommend the authors attempt this as it will make their results stronger.

      2. Most studies only gave a geographic area or descriptive location.
      The spatial analysis was based on a 5km and 20km radius of the 'study site' location, but for many of the studies the exact site is not known. Therefore the 'study site' was artificially generated using a polygon centroid. Considering that the polygon could be an administrative boundary (i.e. district/state/country), this is an extremely large area for which a 5km radius circle in the middle of the polygon is being taken as representative of the 'study site'. This doesn't make sense as it assumes that the landscape is uniform across the district, which in most cases it will not be (in rural areas it is going to be a mixture of villages, forest, plantation, crops etc which will vary across the landscape). This might just be a case of misunderstanding what was done (in which case the text needs rewording to make it clearer) or, if I have interpreted it correctly, the selection of the centroid to represent the study area does not make sense. I am not sure how to overcome this as it is probably not possible to get exact locations for the study sites. One possibility could be to make the remote sensing data the same scale as the prevalence data, i.e. if the study site is only identifiable at the polygon level, then the remote sensing data (fragmentation, cover and population) is used at the polygon level.
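
      The scale mismatch in point 2 can be made concrete with a rough sketch. The district area below is a hypothetical value chosen only for illustration, and the calculation assumes the circular buffer lies entirely inside the polygon; `buffer_coverage` is an illustrative helper, not part of the authors' analysis.

```python
import math


def buffer_coverage(radius_km, polygon_area_km2):
    """Fraction of a polygon's area covered by a circular buffer.

    Assumes the circle sits fully inside the polygon; capped at 1.0.
    """
    circle_area_km2 = math.pi * radius_km ** 2
    return min(circle_area_km2 / polygon_area_km2, 1.0)


district_area_km2 = 2000.0  # hypothetical administrative unit

for radius_km in (5, 20):
    share = buffer_coverage(radius_km, district_area_km2)
    print(f"{radius_km} km buffer covers {share:.1%} of the district")
```

      Under these assumptions a 5km buffer samples only a few percent of such a district, so landscape metrics taken around the centroid may say little about where the animals were actually sampled.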

      Both these issues could have an impact on the study's findings. I would think that in both cases it might make the relationship between the environmental variables and prevalence even clearer.

    3. Reviewer #2 (Public Review):

      This is the first comprehensive study aimed at assessing the impact of landscape modification on the prevalence of P. knowlesi malaria in non-human primates in Southeast Asia. This is a very important and timely topic both in terms of developing a better understanding of zoonotic disease spillover and the impact of human modification of landscape on disease prevalence.

      This study uses the meta-analysis approach to incorporate the existing data sources into a new and completely independent study that answers novel research questions linked to geospatial data analysis. The challenge, however, is that neither the sampling design of previous studies nor their geospatial accuracy are intended for spatially-explicit assessments of landscape impact. On the one hand, the data collection scheme in existing studies was intentionally opportunistic and does not represent a full range of landscape conditions that would allow for inferring the linkages between landscape parameters and P. knowlesi prevalence in NHP across the region as a whole. On the other hand, the absolute majority of existing studies did not have locational precision in reporting results and thus sweeping assumptions about the landscape representation had to be made for the modeling experiment. Finally, the landscape characterization was oversimplified in this study, making it difficult to extract meaningful relationships between the NHP/human intersection on the landscape and the consequences for P. knowlesi malaria transmission and prevalence.

      Despite many study limitations, the authors point to the critical importance of understanding vector dynamics in fragmented forested landscapes as the likely primary driver in enhanced malaria transmission. This is an important conclusion particularly when taken together with the emerging evidence of substantially different mosquito biting behaviors than previously reported across various geographic regions.

      Another important component of this study is its recognition and focus on the value of geospatial analysis and the availability of geospatial data for understanding complex human/environment interactions to enable monitoring and forecasting potential for zoonotic disease spillover into human populations. More multi-disciplinary focus on disease modeling is of crucial importance for current and future goals of eliminating existing and preventing novel disease outbreaks.

    1. eLife assessment

      This study by Verdikt et al. provided solid evidence demonstrating the potential impacts of Δ9-tetrahydrocannabinol (Δ9-THC) on early embryonic development using mouse embryonic stem cells (mESCs) and in vitro differentiation. Their results revealed that Δ9-THC enhanced mESCs proliferation and metabolic adaptation, possibly persisting through differentiation to Primordial Germ Cell-Like Cells (PGCLCs), though the evidence supporting this persistence was incomplete. Although the study is important, it was limited by being conducted solely in vitro and lacking parallel human model experiments.

    2. Reviewer #1 (Public Review):

      The authors investigated the metabolic effects of ∆9-THC, the main psychoactive component of cannabis, on early mouse embryonic cell types. They found that ∆9-THC increases proliferation in female mouse embryonic stem cells (mESCs) and upregulates glycolysis. Additionally, primordial germ cell-like cells (PGCLCs) differentiated from ∆9-THC-exposed cells also show alterations to their metabolism. The study is valuable because it shows that physiologically relevant ∆9-THC concentrations have metabolic effects on cell types from the early embryo, which may cause developmental effects. However, the claim of "metabolic memory" is not justified by the current data, since the effects on PGCLCs could potentially be due to ∆9-THC persisting in the cultured cells over the course of the experiment, even after the growth medium without ∆9-THC was added.

      The study shows that ∆9-THC increases the proliferation rate of mESCs but not mEpiLCs, without substantially affecting cell viability, except at the highest dose of 100 µM which shows toxicity (Figure 1). Treatment of mESCs with rimonabant (a CB1 receptor antagonist) blocks the effect of 100 nM ∆9-THC on cell proliferation, showing that the proliferative effect is mediated by CB1 receptor signaling. Similarly, treatment with 2-deoxyglucose, a glycolysis inhibitor, also blocks this proliferative effect (Figure 4G-H). Therefore, the effect of ∆9-THC depends on both CB1 signaling and glycolysis. This set of experiments strengthens the conclusions of the study by helping to elucidate the mechanism of the effects of ∆9-THC.

      Although several experiments independently showed a metabolic effect of ∆9-THC treatment, this effect was not dose-dependent over the range of concentrations tested (10 nM and above). Given that metabolic effects were observed even at 10 nM ∆9-THC (see for example Figure 1C and 3B), the authors should test lower concentrations to determine the dose-dependence and EC50 of this effect. The authors should also compare their observed EC50 with the binding affinity of ∆9-THC to cellular receptors such as CB1, CB2, and GPR55 (reported by other studies).

      The study also profiles the transcriptome and metabolome of cells exposed to 100 nM ∆9-THC. Although the transcriptomic changes are modest overall, there is upregulation of anabolic genes, consistent with the increased proliferation rate in mESCs. Metabolomic profiling revealed a broad upregulation of metabolites in mESCs treated with 100 nM ∆9-THC.

      Additionally, the study shows that ∆9-THC can influence germ cell specification. mESCs were differentiated to mEpiLCs in the presence or absence of ∆9-THC, and the mEpiLCs were subsequently differentiated to mPGCLCs. mPGCLC induction efficiency was tracked using a BV:SC dual fluorescent reporter. ∆9-THC treated cells had a moderate increase in the double positive mPGCLC population and a decrease in the double negative population. A cell tracking dye showed that mPGCLCs differentiated from ∆9-THC treated cells had undergone more divisions on average. As with the mESCs, these mPGCLCs also had altered gene expression and metabolism, consistent with an increased proliferation rate.

      My main criticism is that the current experimental setup does not distinguish between "metabolic memory" vs. carryover of THC (or its metabolites) causing metabolic effects. The authors assume that their PGCLC induction was performed "in the absence of continuous exposure" but this assumption may not be justified. ∆9-THC might persist in the cells since it is highly hydrophobic. In order to rule out the persistence of ∆9-THC as an explanation of the effects seen in PGCLCs, the authors should measure concentrations of ∆9-THC and THC metabolites over time during the course of their PGCLC induction experiment. This could be done by mass spectrometry. This is particularly important because 10 nM of ∆9-THC was shown to have metabolic effects (Figure 1C, 3B, etc.). Since the EpiLCs were treated with 100 nM, if even 10% of the ∆9-THC remained, this could account for the metabolic effects. If the authors want to prove "metabolic memory", they need to show that the concentration of ∆9-THC is below the minimum dose required for metabolic effects.
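
      The carryover argument above comes down to simple arithmetic, sketched here. The 100 nM treatment and the 10 nM lowest effective dose are the values quoted in this review; the function name and the example retention values are hypothetical, and no actual retention of ∆9-THC has been measured.

```python
def min_retained_fraction(treatment_nM, lowest_effective_nM):
    """Smallest fraction of the treatment dose that would still reach
    the lowest concentration reported to have a metabolic effect."""
    return lowest_effective_nM / treatment_nM


treatment_nM = 100.0         # dose applied to the cells, as quoted in the review
lowest_effective_nM = 10.0   # lowest tested dose with a reported effect

threshold = min_retained_fraction(treatment_nM, lowest_effective_nM)

# Hypothetical retention scenarios, for illustration only.
for retained in (0.01, 0.05, 0.10, 0.20):
    residual_nM = treatment_nM * retained
    could_explain = retained >= threshold
    print(f"{retained:.0%} retained -> {residual_nM:.0f} nM, "
          f"carryover could explain effect: {could_explain}")
```

      Any measured residual concentration below this threshold (and below whatever lower effective dose future titrations reveal) would support "metabolic memory" over carryover.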

      Overall, this study is promising but needs some additional work in order to justify its conclusions. The developmental effects of ∆9-THC exposure are important for society to understand, and the results of this study are significant for public health.

    3. Reviewer #2 (Public Review):

      In the study conducted by Verdikt et al, the authors employed mouse Embryonic Stem Cells (ESCs) and in vitro differentiation techniques to demonstrate that exposure to cannabis, specifically Δ9-tetrahydrocannabinol (Δ9-THC), could potentially influence early embryonic development. Δ9-THC was found to augment the proliferation of naïve mouse ESCs, but not formative Epiblast-like Cells (EpiLCs). This enhanced proliferation relies on binding to the CB1 receptor. Moreover, Δ9-THC exposure was noted to boost glycolytic rates and anabolic capabilities in mESCs. The metabolic adaptations brought on by Δ9-THC exposure persisted during differentiation into Primordial Germ Cell-Like Cells (PGCLCs), even when direct exposure ceased, and correlated with a shift in their transcriptional profile. This study provides the first comprehensive molecular assessment of the effects of Δ9-THC exposure on mouse ESCs and their early derivatives. The manuscript underscores the potential ramifications of cannabis exposure on early embryonic development and pluripotent stem cells. However, it is important to note the limitations of this study: firstly, all experiments were conducted in vitro, and secondly, the study lacks analogous experiments in human models.

    4. Reviewer #3 (Public Review):

      Verdikt et al. focused on the influence of Δ9-THC, the most abundant phytocannabinoid, on early embryonic processes. The authors chose an in vitro differentiation system as a model and compared the proliferation rate, metabolic status, and transcriptional level in ESCs exposed to Δ9-THC. They also evaluated the change of metabolism and transcriptome in PGCLCs derived from Δ9-THC-exposed cells. None of the methods in this paper involve the differentiation of ESCs to lineage-specific cells, so the results cannot demonstrate the impact of Δ9-THC on preimplantation developmental stages. In brief, the authors want to explore the impact of Δ9-THC on preimplantation developmental stages, but they only detected changes in Δ9-THC-exposed ESCs and in PGCLCs derived from them, which amounts to a molecular characterization of the impact of Δ9-THC exposure on ESCs and PGCLCs.

    1. eLife assessment

      This manuscript describes useful information on in vitro binding and hydrogen exchange (HDX) mass spectrometry experiments using various bacterial-expressed BRAF N-terminal fragments, to tease out domain-specific interactions with RAS proteins or wildtype and oncogenic mutant BRAF kinase fragments. The characterization of the auto-inhibitory mechanism of the regulation of BRAF is solid but several concerns remain. The data will be of interest for researchers in the RAS/RAF and general kinase regulation fields.

    2. Reviewer #1 (Public Review):

      Trebino et al. investigated the BRAF activation process by analysing the interactions of BRAF N-terminal regulatory regions (CRD, RBD, and BSR) with the C-terminal kinase domain and with the upstream regulators HRAS and KRAS. To this end, they generated four constructs comprising different combinations of N-terminal domains of BRAF and analysed their interaction with HRAS as well as conformational changes that occur. By HDX-MS they confirmed that the RBD is indeed the main mediator of interaction with HRAS. Moreover, they observed that HRAS binding leads to conformational changes exposing the BSR to the environment. Next, the authors used OpenSPR to determine the binding affinities of HRAS to the different BRAF constructs. While BSR+RBD, RBD+CRD, and RBD bound HRAS with nanomolar affinity, no binding was observed with the construct comprising all three domains. Based on these experiments, the authors concluded that BSR and CRD negatively regulate binding to HRAS and hypothesised that BSR may confer some RAS isoform specificity. They corroborated this notion by showing that KRAS bound to BRAF-NT1 (BSR+RBD+CRD) while HRAS did not. Next, the authors analysed the autoinhibitory interaction occurring between the N-terminal regions and the kinase domain. Through pulldown and OpenSPR experiments, they confirm that it is mainly the CRD that makes the necessary contacts with the kinase domain. In addition, they show that the BSR stabilizes these interactions and that the addition of HRAS abolishes them. Finally, the D594G mutation within the KD of BRAF is shown to destabilise these autoinhibitory interactions, which could explain its oncogenic potential.

      Overall, the in vitro study provides new insights into the regulation of BRAF and its interactions with HRAS and KRAS through a comprehensive in vitro analysis of the BRAF N-terminal region. Also, the authors report the first KD values for the N- and C-terminal interactions of BRAF and show that the BSR might provide isoform specificity towards KRAS. While these findings could be useful for the development of a new generation of inhibitors, the overall impact of the manuscript could probably be enhanced if the authors were to investigate in more detail how the BSR-mediated specificity of BRAF towards certain RAS isoforms is achieved. Moreover, though the very "clean" in vitro approach is appreciated, it also seems useful to examine whether the observed interactions and conformational changes occur in the full-length BRAF molecule and in more physiological contexts. Some of the results could be compared with studies including full-length constructs.

    3. Reviewer #2 (Public Review):

      In the manuscript entitled 'Unveiling the Domain-Specific and RAS Isoform-Specific Details of BRAF Regulation', the authors conduct a series of in vitro experiments using N-terminal and C-terminal BRAF fragments (SPR, HDX-MS, pull-down assays) to interrogate BRAF domain-specific autoinhibitory interactions and engagement by H- and KRAS GTPases. Of the three RAF isoforms, BRAF contains an extended N-terminal domain that has yet to be detected in X-ray and cryoEM reconstructions but has been proposed to interact with the KRAS hypervariable region. The investigators probe binding interactions between 4 N-terminal (NT) BRAF fragments (containing one or more NT domains (BSR, RBD, and CRD)) and full-length bacterially expressed HRAS and KRAS, as well as two BRAF C-terminal kinase fragments, to tease out the underlying contribution of domain-specific binding events. They find, consistent with previous studies, that the BRAF BSR domain may negatively regulate RAS binding and propose that the presence of the BSR domain in BRAF provides an additional layer of autoinhibitory constraints that mediate BRAF activity in a RAS-isoform-specific manner. One of the fragments studied contains an oncogenic mutation in the kinase domain (BRAF-KD D594G). The investigators find that this mutant shows reduced interactions with an N-terminal regulatory fragment and postulate that this oncogenic BRAF mutant may promote BRAF activation by weakening autoinhibitory interactions between the N- and C-terminus.

      While this manuscript sheds light on BRAF-specific autoinhibitory interactions and on the identification and partial characterization of an oncogenic kinase domain (KD) mutant, several concerns exist with the in vitro binding studies: they are performed using tagged, isolated, bacterially expressed fragments and 'dimerized' RAS constructs, and they lack relevant citations, controls, comparisons and data/error analysis. Detailed concerns are listed below.

      1. Bacterial-expressed truncated BRAF constructs are used to dissect the role of individual domains in BRAF autoinhibition. Concerns exist regarding the possibility that bacterial expression of isolated domains or regions of BRAF could miss important posttranslational modifications, intra-molecular interactions, or conformational changes that may occur in the context of the full-length protein in mammalian cells. This concern is not addressed in the manuscript.

      2. The experiments employ BRAF NT constructs that retain an MBP tag and RAS proteins with a GST tag. Have the investigators conducted control experiments to verify that the tags do not induce or perturb native interactions?

      3. The investigators state that the GST tag on the RAS constructs was used to promote RAS dimerization, as RAS dimerization is proposed to be key for RAF activation. However, recent findings argue against the role of RAS dimers in RAF dimerization and activation (Simanshu et al, Mol. Cell 2023). Moreover, while GST can dimerize, it is unclear whether this promotes RAS dimerization as suggested. In methods for the OpenSPR experiments probing NT BRAF:RAS interactions, it is stated that "monomeric KRAS was flowed...". This terminology is a bit confusing. How was the monomeric state of KRAS determined and what was the rationale behind the experiment? Is there a difference in binding interactions between "monomeric vs dimeric KRAS"?

      4. The investigators determine binding affinities between GST-HRAS and NT BRAF domains (NT2 7.5 ± 3.5 nM; NT3 22 ± 11 nM) by SPR, and propose that the BSR domain has an inhibitory role in HRAS interactions with the RAF NT. However, it is unclear whether these differences are statistically meaningful given the error.

      5. It is unclear why NT1 (BSR+RBD+CRD) was not included in the HDX experiments, which makes it challenging to directly compare and determine specific contributions of each domain in the presence of HRAS. Including NT1 in the experimental design could provide a more comprehensive understanding of the interplay between the domains and their respective roles in the HRAS-BRAF interaction. Further, excluding certain domains from the constructs, such as the BSR or CRD, may overlook potential domain-domain interactions and their influence on the conformational changes induced by HRAS binding.

      6. The authors perform pulldown experiments with BRAF constructs (NT1: BSR+RBD+CRD, NT2: BSR+RBD, NT3: RBD+CRD, NT4: RBD alone), in which biotinylated BRAF-KD was captured on streptavidin beads and probed for bound His/MBP-tagged BRAF NTs. Western blot results suggest that only NT1 and NT3 bind to the KD (Figure 5). However, performing a pulldown experiment with an additional construct, CRD alone, would help to determine whether the CRD alone is sufficient for the interaction or if the presence of the RBD is required for higher affinity binding. This additional experiment would strengthen the authors' arguments and provide further insights into the mechanism of BRAF autoinhibition.

      7. While the investigators state that their findings indicate that H- and KRAS differentially interact with BRAF, most of the experiments are focused on HRAS, with only a subset on KRAS. As SPR & pull-down experiments are only conducted on NT1 and NT2, evidence for RAS isoform-specific interactions is weak. It is unclear why parallel experiments were not conducted with KRAS using BRAF NT3 & NT4 constructs.

      8. The investigators do not cite the AlphaFold prediction of full-length BRAF (AF-P15056-F1) or the known X-ray structure of the BRAF BSR domain. Hence, it is unclear how AlphaFold is used to gain new structural information, and whether it was used to predict the structure of the N-terminal regulatory region or the full-length protein.

      9. In HDX-MS experiments, it is unclear how the authors determine whether small differences in deuterium uptake observed for some of the peptide fragments are statistically significant. It is also unclear why, for some of the labeling reaction times, the investigators state "± HRAS only" for only 3 time points.

      10. The investigators find that KRAS binds NT1 in SPR experiments, whereas HRAS does not. However, the pull-down assays show NT1 binding to both KRAS and HRAS. SI Fig 5 attributes this to slow association, yet both SPR (on/off rates) and equilibrium binding measurements are conducted. This data should be able to 'tease' out differences in association.

      11. The model in Figure 7B highlights BSR interactions with KRAS, however, BSR interactions with the KRAS HVR (proximal to the membrane) are not shown, as supported by Terrell et al. (2019).

      12. The investigators state that 'These findings demonstrate that HRAS binding to BRAF directly relieves BRAF autoinhibition by disrupting the NT1-KD interaction, providing the first in vitro evidence of RAS-mediated relief of RAF autoinhibition, the central dogma of RAS-RAF regulation.' However, Tran et al. (2005, JBC) report pull-down experiments using N- and C-terminal fragments of BRAF and state that 'BRAF also contains an N-terminal autoinhibitory domain and that the interaction of this domain with the catalytic domain was inhibited by binding to active HRAS'. This reference is not cited.

      13. In Fig 2, panels A and C, it is unclear what the grey dotted line is in each plot.

      14. In Fig 3, error analysis is not provided for panel E.

      15. How was RAS GMPPNP loading verified?
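
      As a rough check on point 4 above, the two reported affinities can be compared in units of their combined uncertainty. This sketch assumes, without confirmation from the manuscript, that the ± values are independent standard errors; if they are standard deviations or come from very few replicates, a proper test on the raw replicate data is needed instead. The function name is illustrative.

```python
import math


def difference_in_error_units(mean_a, err_a, mean_b, err_b):
    """Absolute difference between two estimates divided by their
    combined error (errors added in quadrature, assuming independence)."""
    combined_err = math.sqrt(err_a ** 2 + err_b ** 2)
    return abs(mean_a - mean_b) / combined_err


# Affinities quoted in point 4 (nM): NT2 = 7.5 ± 3.5, NT3 = 22 ± 11.
z = difference_in_error_units(7.5, 3.5, 22.0, 11.0)
print(f"difference = {z:.2f} combined-error units")
```

      Under these assumptions the difference is a little over one combined-error unit, well short of the roughly two units usually taken as a minimal indication of a real difference, which is consistent with the concern raised in point 4.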

    1. eLife assessment

      The study by Gu et al. presents direct evidence on the role of microglia morphological dynamics during sleep/wake cycles and the modulatory effect of sleep deprivation, making it a valuable contribution to the ongoing investigation of microglial function. The use of a novel miniature two-photon microscope technique adds strength to the evidence supporting the conclusions. However, concerns remain about certain methodological and experimental aspects of the study, indicating that further validation is necessary and that the evidence presented is currently incomplete.

    1. eLife assessment

      The discovery of Homo naledi-associated evidence for intentional burial and engravings would undoubtedly have important implications for our understanding of the evolution of complex cognition and behavior. Based on claims made in two related preprints by Berger et al., this study discusses the potential implications of the purported mortuary practice and symbolic behaviors claimed to be associated with the small-brained H. naledi remains. Unfortunately, the evidence presented in the two related submissions that the current paper entirely relies on is incomplete at this stage.

    2. Reviewer #1 (Public Review):

      As this experience as a reviewer has been unusual, it may be helpful to outline some relevant parameters of the task at the outset. While I was invited to review the Fuentes et al. study only, two additional papers concerning the claimed engravings and burials associated with Homo naledi by Berger and colleagues were also provided as components of the reviewer package. The two manuscripts presenting the archaeological evidence are accessible as preprints in bioRxiv, by Lee Berger and colleagues (2023a, 2023b).

      Unfortunately, the arguments in the Fuentes et al manuscript hinge entirely on the strength of archaeological evidence for engravings and intentional burial by Homo naledi (presented in the abovementioned two preprints). All inferences regarding hominin behaviour and biology of Homo naledi, discussed by Fuentes and colleagues, are wholly dependent on the evidence presented in the archaeology preprints being true.

      Yet both of the archaeological manuscripts are unfortunately weak. In short, the claims for engravings depend on the demonstration of several elements of association that are rather standard for linking material traces found in the archaeological record with particular hominin behaviours. For the particular arguments by Berger and colleagues to be demonstrated, the traces on the rock surface need to be linked causally with hominin agency; in other words, their anthropogenic nature needs to be established. The author of the engravings needs to be demonstrated as a particular hominin species (Homo naledi in this case), and the activity of engraving needs to have taken place ~241-335 kya. After reading the manuscript on the engravings, however, what is clear is that the scratches could as easily have been made by a modern-day farmer 50 years ago as by Homo naledi ~335 kya. Berger and colleagues do not present any evidence to the contrary; they simply describe their narrative as the most parsimonious scenario. A particularly curious piece of information presented as evidence is a list of individuals known to have entered the Dinaledi system in recent times (and known not to have scratched the walls, one presumes, though this is not stated).

      The question of intentional burial is more complex. What we know from other widely accepted early burials is that documenting the geoarchaeological context of the hominin remains is critical to assess the likelihood of an intentional burial - this needs to be established at the outset through high quality fieldwork. Yet even the boundaries of the excavation presented in the burial manuscript appear so angled or skew relative to one another (Fig. 2a) that the individual squares look to be aligned with different XY grids, which does not instill confidence in the quality of field documentation. One can make out very little from the sediment section images - which are key to identifying intrusive features associated with burials - and the multivariate geochemical analysis of sediments is unconvincing: a scatterplot (not a biplot) should have been provided showing the geochemistry of the burial sediment samples relative to the immediately surrounding sediment characteristics. While one remains excited about the potential for a spectacular archaeological discovery within the Dinaledi cave system, unfortunately, the three manuscripts provided do not present convincing evidence to that effect.

    3. Reviewer #2 (Public Review):

      Fuentes et al. provide a detailed and thoughtful commentary on the evolutionary and behavioral implications of complex behaviors associated with a small-brained hominin, Homo naledi. Within the Rising Star Cave of South Africa, Berger et al. 2023a,b proposed evidence that Homo naledi intentionally buried their dead through complex mortuary practices and engaged in symbolic expression by engraving the cave walls in cross-hatching motifs. Two burials were identified in the Rising Star cave subsystems: Feature 1 in the Dinaledi Chamber and a feature in the Hill Antechamber. The engravings are located in the Hill Antechamber near the passageway leading into the Dinaledi chamber. The authors aimed to provide evidence for burials by (1) testing sediment samples for mineral composition from within and outside the burial feature; (2) demonstrating an interruption in the stratigraphy indicative of a "bowl-shaped" feature; (3) evaluating the anatomical coherence of the skeletal remains; (4) demonstrating matrix-supported positioning of skeletal elements; and (5) determining the compatibility of non-articulated material with decomposition and subsequent collapse. Berger et al. 2023b evaluated the engravings through high resolution photography, cross-polarization, and 3D photogrammetry. Neither article involved radiometric dating of materials. While the review by Fuentes et al. highlights important assumptions about the relationship between hominin brain size, cognition, and complex behaviors, the evidence presented by Berger et al. 2023a,b does not support the claim that Homo naledi engaged in burial practices or symbolic expression through wall engravings.

      The major weaknesses for Berger et al. 2023a are as follows:

      1) The mineral composition from sediment sampled from within Dinaledi Feature 1 is not different compared to the surrounding sediment, which is one rationale proposed by the authors that would lead to the conclusion of a burial pit. An effort to replicate the multivariate statistical analysis using the data provided in SI Table 1 by this reviewer failed, and thus, the results are not replicable.

      2) The authors failed to provide clear visualizations or analysis that showed an unambiguous interruption in the stratigraphy surrounding the Dinaledi Feature 1.

      3) Attempts 1 and 2 were applied solely to Dinaledi Feature 1, not the Hill Antechamber Feature.

      4) Skeletal cohesion does suggest that the bodies were likely covered or protected from the external environment. However, given the geological context, there is minimal opportunity for scavengers or other agents to scatter the skeletal remains within such an isolated location. Thus, this alone cannot support intentional burials, as this line of evidence is subject to equifinality.

      5) Similar to the preceding statement, evidence for matrix-supported elements was inconclusive at best. There was no mention of sedimentation rate or expectations for how quickly sediments would naturally bury the remains of whole bodies in the chamber compared with the rate of decomposition of buried remains.

      The major weaknesses for Berger et al. 2023b are as follows:

      6) While this is incredibly difficult to accomplish, dating rock art or other cave wall engravings is the only method to ensure that the etchings were created during the time of Homo naledi. Unfortunately, this was not attempted. Instead, the authors state that "This description is intended to document the discovery and provide spatial and contextual information prior to any further analyses that may require invasive sampling." Yet, the authors assign a date to the engravings in the title of the paper. Here, the authors are generating interpretations before analyses are attempted.

      7) The engravings are indeed very interesting and are likely anthropogenic in origin. However, the argument that these engravings were created by Homo naledi is based on the bold assumption that "No physical or cultural evidence of any other hominin population occurs within this part of the cave system, and there is no evidence that recent humans or earlier hominins ever entered any adjacent area of the cave until surveys by human cave explorers during the last 40 years." (page 6). To assume that no other individual entered the cave system from the time of Homo naledi until 40 years ago is an unrealistic and faulty assumption. This reviewer does not discount that the engravings could have been made by Homo naledi, but the evidence must be sufficient to support this statement or provide other alternatives as working hypotheses.

      As a discipline, paleoanthropology aims to understand the evolutionary history of the hominin clade through fossil remains, material culture, and, most recently, ancient DNA. The methods and approaches that we as paleoanthropologists use to understand the past often bridge both the humanities and the hard sciences to create a unique understanding of our shared history. We are only limited by the conditions in which time and attrition have erased pieces of our collective story from the earth. Thus, it is our responsibility to ensure that our interpretations of the past are supported by measurable and testable means, to the best of our ability, and that hypotheses are not presented as conclusions.

      Unfortunately, this is not the case for Berger et al. 2023a,b. The work presented by the authors is imprudent and incomplete and does not meet the requirements set forth by our discipline. While it is important that scholars publish their work in a timely manner, it is arguably more critical for scholars to take the necessary time to ensure the integrity and resolution of the work. Rushing the publication of such a significant but unsubstantiated find will likely have perilous ramifications, as it is more difficult to correct an idea than to introduce one.

    4. Reviewer #3 (Public Review):

      This paper presents the cognitive implications of claims made in two accompanying papers (Berger et al. 2023a, 2023b) about the creation of rock engravings, the intentional disposal of the dead, and fire use by Homo naledi. The importance of the paper, therefore, relies on the validity of the claims for the presence of socio-culturally complex and cognitively demanding behaviors that are presented in the associated papers. Given the archaeological, hominin, and taphonomic analyses in the associated papers are not adequate to enable the exceptional claims for naledi-associated complex behaviors, the inferences made in this paper are currently inadequate and incomplete.

      The claimed behaviors are widely recognized as complex and even quintessential to Homo sapiens. The implications of their unequivocal association with such a small-brained Middle Pleistocene hominin are thus far reaching. Accordingly, the main thrust of the paper is to highlight that greater cognition and complex socio-cultural behaviors were not necessarily associated with a positively encephalized brain. This argument begs the obvious question of whether absolute brain size and/or encephalization quotient (i.e., the actual brain volume of a given species relative to the expected brain size for a species of the same average body size) can measure cognitive capacity and the complexity of socio-cultural behaviors among late Middle Pleistocene hominins.

      Claims for a positive correlation between absolute and/or relative brain size and cognitive ability are not common in discussions surrounding the evolution of Middle- and Late Pleistocene hominin behavior. Currently, the bulk of the evidence for early complex technological and social behaviors derives from multiple sites across South Africa and postdates the emergence of H. sapiens by more than 100,000 years. Such lag in the expression of complex technologies and behaviors within our species renders the brain size-implies-cognitive capacity argument moot. Instead, a rich body of research over the past several decades has focused on socio-cultural and environmental factors, and even the wiring of the brain, in order to understand the factors underlying the expression of the capacity for greater behavioral variability. In this regard, even if the claimed evidence for complex behaviors among the small-brained naledi populations proves valid, the exploration of the specific/potential socio-cultural, neuro-structural, ecological and other factors will be more informative than the emphasis on absolute/relative brain size.

      The paper presents as supporting evidence previous claims for the appearance of similar complex behaviors predating the emergence of our species, H. sapiens, although it does acknowledge their controversial nature. It then uses the current claims for the association of such behaviors with H. naledi as decisive. Given the inadequate analyses in the accompanying papers and the lack of evidence for stone tools in the naledi sites, the present claims for the expression of culturally and symbolically mediated behaviors by this small-brained hominin must be adequately established. The importance of the paper thus rests on the validity of the claimed evidence--including contextual aspects--for rock engraving, mortuary practices, and the use of fire presented in the associated two papers. The claims in both associated papers are inadequate, incomplete, and largely assumption- (rather than evidence) based. As responsible and ethical researchers, the team must return to the sites, conduct the required standard chronometric and taphonomic studies and weigh the strength of the evidence before proceeding with the current claims.

    5. Author Response:

      We would like to thank the eLife reviewers for the considerable time and effort they have invested to review these manuscripts. We have also benefited from a previous round of review of the manuscript describing the proposed burial features, which underwent two rounds of revisions in a high-impact journal over a period of approximately 8 months during 2022 and early 2023. Both sets of reviews have reflected mixed responses to the evidence we have presented, with one reviewer recommending acceptance with minor editorial revisions, two recommending acceptance with minor revisions and the fourth recommending rejection based upon similar arguments to those reflected by some of the reviewers in this current round of reviews in eLife. Ultimately the managing editor of this first journal took the decision that the review process could not be completed in a timely manner and rejected the manuscript, although the submission here reflected our consideration of these reviewers' suggestions.

      We have chosen in this initial response to the eLife reviews to include some references to the previous anonymous reviews in order to illustrate differences of opinion and differences in revision suggestions within the review process. Our goal is to offer maximal insight into our decision-making process and to acknowledge the considerable time and effort put into the assessment of these manuscripts by reviewers (for eLife and in the case of the earlier review process). We hope that this approach will assist the readers, and reviewers, of our manuscripts in understanding why we are proceeding with certain decisions during the revision process.

      This is a new process for us and the reviewers, and one way in which it significantly differs from more traditional review is that both the reviews and our reply will be public well in advance of our revisions to the manuscript. Indeed, considering the scope of the reviews, some of those revisions may take considerable time, although many can be accomplished fairly easily. Thus, we are not in a position to say that we have solved every issue raised by the reviewers. Instead, we will examine what appear to be the key critical issues raised regarding the data and the analyses and how we propose to address these as we revise the papers. We will also address several philosophical and ethical issues raised by the reviews and our proposal for dealing with these. More specific editorial and citational recommendations will be dealt with on a case-by-case basis, and we do not address these point-by-point in this reply. Please note, this response to the reviewers is not the revision of the manuscript and is only the initial opinion of the corresponding authors with some guidance from the larger group of authors of all three papers. Our final submitted revision will reflect the input of all authors included on those submissions.

      We took the decision to submit three separate papers consciously. The two different categories of evidence, burials and engravings, involve different kinds of analysis and different (although overlapping) teams of researchers, and we recognized that each deserved its own presentation and assessment. Meanwhile, together they inform the context of H. naledi in a way that requires some synthetic discussion, in which both kinds of evidence are relevant, leading to a third paper. But the mutual relevance of these different kinds of evidence and their review by a common set of reviewers naturally raises cross-cutting issues, and the reviewers have cross-referenced the three articles. This has sometimes led to suggestions about one manuscript based on the contents of another. Considering the situation, we accepted the recommendation that it would be clearer to consider all three articles in a single reply. Thus, while each of the three papers will proceed separately during the revision process, it will occasionally be necessary to refer across all three papers in our responses.

      Scientific Issues:

      In reading the reviews, we feel there are 9 critical points/assertions raised by one or more of the reviewers that present a problem for, or challenge to, our hypothesis that the observed evidence (bone accumulations and engravings) described in the Dinaledi subsystem are of intentional naledigenic origin. These are:

      1. The evidence presented does not demonstrate a clear interruption of the floor sediments, thus failing to demonstrate excavated holes.

      2. The sediments infilling the holes where the skeletal remains are found have not been demonstrated to originate from the disruption of the floor sediments and thus could be part of a natural geological process (e.g. water movement, slumping) or carnivore accumulations.

      3. Previous geological interpretations by our research group have given alternative geological explanations for formation of the bony accumulations that contradict the evidence presented here and lead to alternative hypotheses of origin.

      4. Burial cannot be effectively assessed without complete excavation of the features and site.

      5. The skeletal remains as presented do not conform clearly to typical body arrangement/positions associated with human (Homo sapiens) burials.

      6. There is no evidence of grave goods or lithic scatters that are typically associated with human burials.

      7. Humans may have been involved with the creation of either the Homo naledi bone accumulations, the engravings, or both.

      8. Without a date of the engravings, the null hypothesis should be the engravings were created by Homo sapiens.

      9. The null hypothesis for explanation of the skeletal remains in this situation should be “natural accumulation”.

      Our analysis of the Dinaledi Feature 1 leads us to accept that the laminated orange-red mudstone (LORM) sedimentary layer is interrupted, indicating a non-natural intervention, and that the hole created by the interruption was then filled by a fleshed body (and perhaps parts of other bodies), which was then covered by sediment that originated from the hole that was dug. We recognize that the four eLife reviewers are not convinced that our presentation is sufficient to establish this. Interestingly, this was not the universal opinion of earlier reviewers of the initial manuscript, several of whom felt we had adequately supported this hypothesis. The lack of clarity in this current version of the burial manuscript is our responsibility. In the upcoming revision of this paper, we will take the reviewers’ critiques to heart and add figures that better illustrate the disruption of the LORM and clarify the sedimentological data showing that the material covering the skeletal remains in the hole is the disrupted sediment excavated from the same hole. We are proposing to isolate this most critical evidence for burial into a separate section in the revised submission based on the reviewers’ comments. The fact that the LORM layer is disrupted, a fleshed body was placed in the hole created by this disruption, and the body (and perhaps parts of other bodies) was/were then covered by the same sediments from the hole is the central feature of our hypothesis that the bone accumulations observed reflect a burial and not a natural process.

      The possibility of fluvial transport or involvement in the subsystem is a topic that we have addressed extensively in past work, and it is clear from these reviews that we must enhance our current manuscript to discuss this issue at greater length. Our previous work (Dirks et al. 2015; Dirks et al. 2017) emphasized that fluvial transport of whole bodies into the subsystem was precluded by several lines of sedimentological evidence. We excavated a rich accumulation of skeletal remains, including articulated limbs and other elements in subvertical orientations inconsistent with slow sedimentary infill, which were difficult to explain without positing either a large and dense pile of bodies and/or sediment movement. We encountered fractured chunks of laminated orange-red mudstone (LORM) in random orientations within our excavation area, within and among skeletal remains, which directly refuted that the remains were inundated with water at the time of burial, and this limited the possibility of fluvial transport. Water flow sufficient to displace bodies or complete skeletal evidence would also transport large and coarse sediment, which is absent from the subsystem, and would sort the commingled skeletal material that we found by size, which we do not observe. But our excavation only covered less than a square meter at very limited depth, and this was the limit to our knowledge of subsurface sediment. We thus were left with uncertainty that led us to suggest the possibility of sediment slumping or movement into subsurface drains, although these were not observed near our excavation. Our current work expands our knowledge of the subsurface and presents an alternative explanation for the disposition of skeletal remains from our earlier excavation. But we acknowledge that this new explanation is vulnerable to our own previous published proposals, and we must do a better job of explaining how the new information addresses our previous suggestions. By not clearly creating a section where we explained how these previous hypotheses were now nullified by new evidence, we clearly confused the reviewers with our own previous work. We will revise the manuscript by enhancing the review of the significant geological evidence demonstrating that there is no significant fluvial action in the system and making it clear how the burial hypothesis provides a clearer explanation for the situation of skeletal remains from our previous excavation work.

      One of the central issues raised by reviewers has been a perceived need to excavate these features completely, totally exhuming all skeletal remains from them. Reviewers have written that it is necessary to identify every skeletal element that is present and account for any missing elements. On this point, we have both ethical and scientific differences from these reviewers. We express our ethical concerns first. Many of the best-preserved possible burials ever discovered by archaeologists were subjected to total excavation and exhumation. Cases like La Chapelle-aux-Saints, La Ferrassie, and Skhūl were fully excavated at a time when data recording and excavation methods did not include the range of spatial and geomorphological approaches that later became routine. The judgment of early investigators that these situations were intentional burials was challenged by later workers, and the kind of information that might enable better tests had been irrevocably lost (Gargett 1999; Dibble et al. 2015; Rendu et al. 2014).

      Later, improved excavation standards have not sufficed to remove uncertainty or debate about possible burials. For example, it was long presumed that well-preserved remains of young children were by themselves diagnostic of intentional burial, such as those from Dederiyeh, Border Cave, or Roc de Marsal. Such cases were also fully excavated, with adequate documentation of the positioning of skeletal remains and their surrounding stratigraphic situation, but such cases were later challenged on several bases and the complete exhumation of material has confused or precluded testing of new hypotheses (e.g. Gargett 1999). The case of Roc de Marsal is one in which data from the initial excavation combined with re-excavation and geoarchaeological analysis led to a naturalistic interpretation of the skeletal material (Sandgathe et al. 2011; Goldberg et al. 2017). But even in this case, the researchers erred in their interpretation of the skeleton’s situation due to a lack of identification of parts of the infant’s skeleton (Gómez-Olivencia and García-Martinez 2019). That is to say, it is not only the burial hypothesis but other hypotheses that suffer from complete excavation. Researchers concerned with preserving all possible information have sometimes taken extraordinary measures to remove and study possible burials at high resolution in the laboratory. Such was the case of the Shanidar IV burial removed from the site and transported in a plaster jacket by Solecki, which led to the disruption and loss of internal stratigraphic information (Pomeroy et al. 2020). Arguably, the current state of the art is full excavation with partial preparation, such as that undertaken at Panga ya Saidi (Martinón-Torres et al. 2021). But again, any future attempt to reinterpret or test the hypothesis of burial must rely on the adequacy of documentation as the original context has been removed.

      In our decision to leave material in place as much as possible, we are expanding upon standard practice to leave witness sections and unexcavated areas for future research. The situation is novel, representing possible burials by a nonhuman species, and that makes it doubly important in our opinion to be conservative in not fully exhuming the skeletal material from its context. We anticipate that many other researchers, including future investigators, will suggest additional methods to further test the hypothesis of burial, something that would be impossible if we had excavated the features in their entirety prior to publishing a description of our work. We believe strongly that our ethical responsibility is to publish the work and the most likely interpretation while leaving as much evidence in place as possible to enable further testing and replication. We welcome the suggestions of additional methods/analyses to test the H. naledi burial hypothesis.

      This being said, we also observe that total exhumation would not resolve the concerns raised by the reviewers. The recommendation of total exhumation is in pursuit of a full account of all skeletal material present and its preservation and spatial situation, in order to demonstrate that they conform to body positions comparable to human burials. As has been highlighted in forensic casework, the excavation of an inhumation feature does not necessarily provide an accurate spatial or anatomical manifest of the stratigraphical relationships between the body, encapsulating matrix, and any cut present due to preservational, taphonomic and operational factors (Dirkmaat and Cabo, 2016; Hunter, 2014). In particular, in cases where skeletal elements are highly fragmented, friable, or degraded (such as through bioerosion) then complete excavation—even under controlled laboratory conditions—may destroy bone and severely limit skeletal identification (Henderson, 1997; Hochrein, 2002; Owsley and Compton, 1997), particularly in elements where the ratio of trabecular to cortical bone is high (Darwent and Lyman, 2002; Lyman, 1994). As such, non-invasive methods of 3D and 4D modelling (preservation in situ) are often considered preferable to complete necropsy or excavation (preservation by record) where appropriate (Bolliger and Thali, 2009; Dell’Unto and Landeschi, 2022; Randolph-Quinney et al., 2018; Silver, 2016). 

      The test of burial is not primarily positional, but taphonomic and geological. The position and number of bones can elaborate on process-driven questions of decay and destruction in the burial environment, or post-mortem modification, but are not singularly indicative of whether the remains were intentionally buried – the post-mortem narrative of all the processes affecting the cadaveric island is required (Knüsel and Robb, 2016). In previous cases, researchers have disputed or accepted the hypothesis of intentional hominin burial based upon assumptions about how modern humans or Neandertals would have positioned bodies, with the idea that some positions reflect ritual intent while others do not. But applying such assumptions is unjustifiable, particularly for a species like H. naledi, whose culture may have differed fundamentally from our own. Our work acknowledges that the present evidence does not enable a full reconstruction of the burial positions, but it does show that fleshed remains were encased in sediment prior to decomposition of soft tissue, and that subsequent spatial changes can be most parsimoniously explained by natural decomposition within sedimentary matrix contained within a burial feature (after Green, 2022; Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022). If the argument is that extraordinary claims require extraordinary evidence, we feel that the evidence documents excavation and interment (and will do so more clearly in the revision), and the fact that the remains do not match a “typical” human burial in body positioning is not in itself evidence that these are not H. naledi burials.

      We feel that the reviewers (in keeping with many palaeoanthropologists) have a clear idea of what they “think” a burial should look like in an idealised sense, but this platonic ideal of burial form is not matched by the extensive literature in archaeothanatology, funerary archaeology and forensic science which indicates enormous variability in the activity, morphology and post-mortem system experienced by the human body in cases of interment and body disposal (e.g. Aspöck, 2008; Boulestin and Duday, 2005 and 2006; Connelly et al., 2005; Channing and Randolph-Quinney, 2006; Cherryson, 2008; Donnelly et al., 1995; Finley, 2000; Hunter, 2014; Parker Pearson, 1999; Randolph-Quinney, 2013). Decades of experience in the identification, recovery and interpretation of clandestine, deviant, and non-formal burials indicates the platonic ideal is rare, and in many contexts, the exception (Cherryson, 2008; Parker Pearson, 1999). This variability is particularly relevant to morphological traits in burial context, such as the informal nature of the grave cut in plan and section, shallow burial depth, and initial disposition of body (placement) during the early post-mortem period. These might run counter to the expectations of reviewers or others referencing the fossil hominin record, but are well accepted within the communities of researchers investigating Holocene archaeological sites and forensic contexts.

      It is encouraging to see reviewers beginning to incorporate the extensive (often experimentally derived) literature from archaeothanatology and forensic taphonomy in their deliberations, and we will be taking these comments on board going forward. In particular, we acknowledge reviewers’ comments and the need to construct a more detailed post-mortem narrative, accounting for joint disarticulation (labile versus persistent joints etc), displacement, and final disposition of elements within the burial space. As such we will incorporate the hierarchy of decomposition (rank order disarticulation), associations between regions of anatomical association, areas of disassociation, and the voids produced during decomposition (after Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022) into our narrative. In doing so we acknowledge the tensions between the inductive archaeothanatological narrative-driven approach (e.g. Duday, 2005 & 2009) versus robust decomposition data derived from human forensic taphonomic experimentation recently articulated by Schotsmans and colleagues (2022) - noting that we will highlight comparative data based on forensic experimental casework and actualistic modelling over inductive intuitive approaches which come with significant evidential shortcomings (Bristow et al. 2011).

      Finally, from a taphonomic perspective it is worth pointing out to reviewers that we have already addressed the issue of lack of taphonomic evidence for carnivore involvement in the formation of the Dinaledi assemblage (Dirks et al., 2016). Absence of any carnivore-induced bone surface modifications, patterns of skeletal part representation, and a total absence of any carnivore remains found within the Dinaledi Chamber (following Kuhn and colleagues, 2010) lead us to reject carnivores as possible vectors of body accumulation within the Dinaledi Chamber and Hill Antechamber.

      Reviewers suggest that without a date derived from geochronological methods, the engravings cannot be associated with H. naledi, and that it is possible (or probable) that the engravings were done in the recent past by H. sapiens. This suggestion neglects the context of the site. We have previously documented the structure and extremely limited accessibility of the Dinaledi subsystem. This subsystem was not recorded on maps of the documented Rising Star Cave system prior to our work and its discovery by our teams. Furthermore, there is no evidence of prehistoric human activity in the areas of the cave related to possible subterranean entrances. There is no evidence that humans in the past typically ventured into extreme spaces such as those of Rising Star. It is clear from the presence of the remains of many individuals that H. naledi ventured into these spaces again and again. It is likely that H. naledi moved through these spaces more easily than humans do based on their physique. We show that the engravings overlay each other, suggesting multiple engraving events. These engravings took time and effort and the only evidence for use of the Dinaledi subsystem by any hominin is by H. naledi. The context leads to the null hypothesis that H. naledi made the marks. In our revision, we will elaborate on this argument to clarify the evidence for our stance on this hypothesis. Several reviewers took issue with the title of the engraving paper as we did not insert a qualifier in front of the suggested date range for the engravings. We deliberately left out qualifying language so that the title took the form of a testable hypothesis rather than a weak assertion. Should future work find the engravings were not produced within this time range, then we will restate this hypothesis.

      Finally, with regard to the engravings, we have chosen to report them because they exist. Not reporting the presence of engraved marks on the walls of a cave above hypothesized burials would be tantamount to leaving relevant evidence out of the description of an archaeological context. We recognize and state in our manuscript that these markings require substantial further study, including attempts at geochronological dating. But the current evidence is clearly relevant to the archaeological context of the subsystem. We take a similar stance with reporting the presence of the tool-shaped artefact near the hand of the H. naledi skeleton in the Hill Antechamber. It is evident that this object requires further study, as we stated in our manuscript, but again omitting it from our study would be leaving out relevant evidence.

      Some have suggested that the null hypothesis should be that all of these observed circumstances are of natural origin. Our team took this approach in our early investigation of the Dinaledi subsystem (Dirks et al. 2015). We adopted the null hypothesis that the geological processes involved in the accumulation of H. naledi skeletal remains were “natural” (i.e., without naledigenic involvement), and we were able to reject many alternative explanations for the assemblage, including carnivore accumulation, “death trap” accumulation, and fluvial transport of bodies or bones (Dirks et al. 2015). This led us to the hypothesis that H. naledi were involved in bringing the bodies into the spaces where they were found. But we did not hypothesize their involvement in the formation of the deposit itself beyond bringing the bodies to the location.

      This approach seems conservative. It followed the traditional view that small-brained hominins do not engage in cultural practices. But we recognize in hindsight that this null hypothesis approach did harm to our analyses. It impeded us from recognizing within our initial excavations of the puzzle box area and other excavations between 2014 and 2017 that we might be encountering remains that were intrusive in the sedimentary floor of the chamber. If we had approached the accumulation of a large number of hominins from the perspective of the null hypothesis being that the situation was likely cultural, we perhaps would have collected evidence in a slightly different manner. We certainly note that if the Dinaledi system had been full of the remains of modern humans, there would have been little doubt that the null hypothesis would have been that this was a cultural space and not a “natural space”. We therefore respectfully disagree with the reviewers who continue to support the idea that we should approach hominin excavations with the null hypothesis that they will be natural (specifically non-cultural) in origin. If excavations continue with this mindset, we believe that potential cultural evidence is almost certain to be lost.

      There has been a gradient across paleoanthropological excavations, archaeological work, and forensic investigation, with increasing precision of context. The reality is that the recording precision and frame of approach are typically different in most paleontological excavations than in those related to contemporary human remains. If anything comes from the present discussion of whether the Dinaledi system is a burial site for H. naledi or not, we hope that by taking seriously the possibility of deep cultural dynamics of hominins, we will encourage other teams to meet the highest standards of excavation in order to preserve potential cultural evidence. Given H. naledi’s cranial capacity, we suggest that even very early hominin skeletal assemblages should be re-examined, if there is sufficient evidence or records available. These would include examples such as the A.L. 333 Au. afarensis site (the so-called First Family site in Hadar, Ethiopia), the Dikika infant skeleton, WT 15000 (Turkana Boy) and even A.L. 288 (Lucy). Such unusual taphonomic situations, where skeletons are preserved, cannot simply be explained away as “natural” in origin based solely on the cranial capacity and assumed lack of cognitive and cultural complexity of the hominins, as we emphasized in Fuentes et al. (2023). We are not the first to observe that some very early hominin situations may represent early mortuary activity (Pettitt 2013), but we would advocate a step further. We suggest it may be damaging to take “natural accumulation” as the standard null hypothesis for hominin paleoanthropology, and that it is more conservative in practice to engage remains with the null hypothesis of possible cultural formation.

      We are deeply grateful for the time and effort all of the 8 reviewers (across three reviews) have taken with this work. We also acknowledge the anonymous reviewers from previous submissions, whose opinions and comments will have made the final iterations of these manuscripts better for their efforts. As this process is rather public and includes commentary outside of the eLife forum, we ask that the efforts of all 37 authors and 8 reviewers involved be respected and that the discourse remain professional in all venues as we study this fascinating and quite complex occurrence. We also appreciate the efforts of members of the public who have engaged with this relatively new process, in which preprints are posted prior to the reviews, allowing comments and interactions from colleagues and the public who are normally not part of the internal peer review process. We believe these interactions will make for better final papers. We feel we have met the standards of demonstrating burials in H. naledi and that the engravings are most likely associated with H. naledi. However, given the reviews, we see many areas where our clarity, context, and analyses were less strong than they can be. With the clarifications and additions taken on board through these review processes, the final papers will be stronger and clearer. We recognize that this is an ongoing process of scientific investigation and that further work will allow continued, and possibly better, evaluation of these hypotheses and others.

      Lee R Berger, Agustín Fuentes, John Hawks, Tebogo Makhubela

      Works cited:

      • Aspöck, E. (2008). What Actually is a ‘Deviant Burial’?: Comparing German-Language and Anglophone Research on ‘Deviant Burials.’ In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books.  pp 17–34.

      • Bolliger, S.A. & Thali, M.J. (2009). Thanatology. In S.A. Bolliger and M.J. Thali (eds) Virtopsy Approach:  3D Optical and Radiological Scanning and Reconstruction in Forensic Medicine. Boca Raton: CRC Press. pp 187-218.

      • Boulestin, B. & Duday, H. (2005). Ethnologie et archéologie de la mort: de l’illusion des références à l’emploi d’un vocabulaire. In: C. Mordant and G. Depierre (eds) Les Pratiques Funéraires à l’Âge du Bronze en France. Actes de la table ronde de Sens-en-Bourgogne. Paris: Éditions du Comité des Travaux Historiques et Scientifiques. pp. 17–30.

      • Boulestin, B. & Duday, H. (2006). Ethnology and archaeology of death: from the illusion of references to the use of a terminology. Archaeologia Polona 44: 149–169.

      • Bristow, J., Simms, Z. & Randolph-Quinney, P.S. Taphonomy. In S. Black and E. Ferguson (eds.) Forensic Anthropology 2000-2010. Boca Raton, FL: CRC Press. pp 279-318.

      • Channing, J. & Randolph-Quinney, P.S. (2006). Death, decay and reconstruction: the archaeology of Ballykilmore Cemetery, County Westmeath. In J. O’Sullivan and M. Stanley (eds.) Settlement, Industry and Ritual: Archaeology. National Roads Authority Monograph Series No. 3. Dublin: NRA/Four Courts Press. pp 113-126.

      • Cherryson, A. K. (2008). Normal, Deviant and Atypical: Burial Variation in Late Saxon Wessex, c. AD 700–1100. In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books. pp 115–130.

      • Connolly, M., F. Coyne & L. G. Lynch (2005). Underworld: Death and Burial in Cloghermore Cave, Co. Kerry. Bray, Co. Wicklow: Wordwell.

      • Darwent, C. M. & R. L. Lyman (2002). Detecting  the postburial fragmentation of carpals, tarsals and phalanges. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press. pp 355-378.

      • d’Errico, F., & Backwell, L. (2016). Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. Journal of Human Evolution, 93, 91–108.

      • De Villiers. H. (1973). Human skeletal remains from Border Cave, Ingwavuma District, KwaZulu, South Africa. Annals of the Transvaal Museum, 28(13), 229–246.

      • Dell’Unto, N. and Landeschi, G. (2022). Archaeological 3D GIS. London: Routledge.

      • Dibble, H. L., Aldeias, V., Goldberg, P., McPherron, S. P., Sandgathe, D., & Steele, T. E. (2015). A critical look at evidence from La Chapelle-aux-Saints supporting an intentional Neandertal burial. Journal of Archaeological Science, 53, 649–657.

      • Dirkmaat, D. C., & Cabo, L. L. (2016). Forensic archaeology and forensic taphonomy: basic considerations on how to properly process and interpret the outdoor forensic scene. Academic Forensic Pathology 6, 439–454.

      • Dirks, P. H., Berger, L. R., Roberts, E. M., Kramers, J. D., Hawks, J., Randolph-Quinney, P. S., Elliott, M., Musiba, C. M., Churchill, S. E., de Ruiter, D. J., Schmid, P., Backwell, L. R., Belyanin, G. A., Boshoff, P., Hunter, K. L., Feuerriegel, E. M., Gurtov, A., Harrison, J. du G., Hunter, R., … Tucker, S. (2015). Geological and taphonomic context for the new hominin species Homo naledi from the Dinaledi Chamber, South Africa. ELife, 4, e09561.

      • Dirks, P.H.G.M., Berger, L.R., Hawks, J., Randolph-Quinney, P.S., Backwell, L.R., and Roberts, E.M. (2016). Comment on “Deliberate body disposal by hominins in the Dinaledi Chamber, Cradle of Humankind, South Africa?” [J. Hum. Evol. 96 (2016) 145-148]. Journal of Human Evolution 96:  149-153.

      • Dirks, P. H., Roberts, E. M., Hilbert-Wolf, H., Kramers, J. D., Hawks, J., Dosseto, A., Duval, M., Elliott, M., Evans, M., Grün, R., Hellstrom, J., Herries, A. I., Joannes-Boyau, R., Makhubela, T. V., Placzek, C. J., Robbins, J., Spandler, C., Wiersma, J., Woodhead, J., & Berger, L. R. (2017). The age of Homo naledi and associated sediments in the Rising Star Cave, South Africa. ELife, 6, e24231.

      • Donnelly, S., C. Donnelly & E. Murphy (1999). The forgotten dead: The cíllíní and disused burial grounds of Ballintoy, County Antrim. Ulster Journal of Archaeology 58, 109-113.

      • Duday, H. (2005). L’archéothanatologie ou l’archéologie de la mort. In: O. Dutour, J.-J. Hublin and B. Vandermeersch (eds) Objets et Méthodes en Paléoanthropologie. Paris: Comité des Travaux Historiques et Scientifiques. pp. 153–215.

      • Duday, H. (2009). Archaeology of the Dead: Lectures in Archaeothanatology. Oxford: Oxbow Books.

      • Finley, N. (2000). Outside of life: Traditions of infant burial in Ireland from cillin to cist.  World Archaeology 31, 407-422.

      • Gargett, R. H. (1999). Middle Palaeolithic burial is not a dead issue: The view from Qafzeh, Saint-Césaire, Kebara, Amud, and Dederiyeh. Journal of Human Evolution, 37(1), 27–90.

      • Goldberg, P., Aldeias, V., Dibble, H., McPherron, S., Sandgathe, D., & Turq, A. (2017). Testing the Roc de Marsal Neandertal “Burial” with Geoarchaeology. Archaeological and Anthropological Sciences, 9(6), 1005–1015.

      • Gómez-Olivencia, A., & García-Martínez, D. (2019). New postcranial remains from the Roc de Marsal Neandertal child. PALEO. Revue d’archéologie Préhistorique, 30–1, 30–1.

      • Green, E.C. (2022). An archaeothanatological approach to the identification of late Anglo-Saxon burials in wooden containers. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 436-455.

      • Henderson, J. (1987). Factors determining the state of preservation of human remains. In A. Boddington, A. Garland and R. Janaway (eds). Death, Decay and Reconstruction: Approaches to Archaeology and Forensic Science. Manchester: Manchester University Press. pp 43-54.

      • Hunter, J. R. (2014). Human remains recovery: archaeological and forensic perspectives. In C. Smith (ed). Encyclopedia of Global Archaeology. New York: Springer New York. pp 3549-3556.

      • Hochrein, M. (2002). An Autopsy of the Grave: Recognizing, Collecting and Preserving Forensic Geotaphonomic Evidence. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press: 45-70.

      • Knüsel, C.K. & Robb, J. (2016). Funerary taphonomy: An overview of goals and methods. Journal of Archaeological Science: Reports 10, 655-673.

      • Kuhn, B.F., Berger, L.R. & Skinner, J.D. (2010). Examining criteria for identifying and differentiating fossil faunal assemblages accumulated by hyenas and hominins using extant hyenid accumulations. International Journal of Osteoarchaeology 20, 15-35.

      • Lyman, R. (1994). Vertebrate Taphonomy. Cambridge, Cambridge University Press.

      • Martinón-Torres, M., d’Errico, F., Santos, E., Álvaro Gallo, A., Amano, N., Archer, W., Armitage, S. J., Arsuaga, J. L., Bermúdez de Castro, J. M., Blinkhorn, J., Crowther, A., Douka, K., Dubernet, S., Faulkner, P., Fernández-Colón, P., Kourampas, N., González García, J., Larreina, D., Le Bourdonnec, F.-X., … Petraglia, M. D. (2021). Earliest known human burial in Africa. Nature, 593(7857), 95–100.

      • Mickleburgh, H.L & Wescott, D.J. (2018). Controlled experimental observations on joint disarticulation and bone displacement of a human body in an open pit: implications for funerary archaeology. Journal of Archaeological Science: Reports 20: 158-167.

      • Mickleburgh, H.L., Wescott, D.J., Gluschitz, S. & Klinkenberg, V.M. (2022). Exploring the use of actualistic forensic taphonomy in the study of (forensic) archaeological human burials: An actualistic experimental research programme at the Forensic Anthropology Center at Texas State University (FACTS), San Marcos, Texas. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 542-562.

      • Owsley, D. & B. Compton (1997). Preservation in late 19th Century iron coffin burials. In W. Haglund and M. Sorg (eds). Forensic Taphonomy: The Postmortem Fate of Human Remains. Boca Raton, FL, CRC Press: 511-526.

      • Parker Pearson, M. (1999). The Archaeology of Death and Burial. College Station: Texas A&M University Press.

      • Pettitt, P. (2013). The Palaeolithic Origins of Human Burial. Routledge.

      • Pomeroy, E., Bennett, P., Hunt, C. O., Reynolds, T., Farr, L., Frouin, M., Holman, J., Lane, R., French, C., & Barker, G. (2020). New Neanderthal remains associated with the ‘flower burial’ at Shanidar Cave. Antiquity, 94(373), 11–26.

      • Randolph-Quinney, P.S. (2013). From the cradle to the grave: the bioarchaeology of Clonfad 3 and Ballykilmore 6. In N. Brady, P. Stevens and J. Channing (eds.). Settlement and Community in the Fir Tulach Kingdom. Dublin: National Roads Authority Press. pp A2.1-48.

      • Randolph-Quinney, P.S., Haines, S. and Kruger, A. (2018). The use of three-dimensional scanning and surface capture methods in recording forensic taphonomic traces: issues of technology, visualisation, and validation. In: W.J. M. Groen and P. M. Barone (eds). Multidisciplinary Approaches to Forensic Archaeology. Berlin: Springer International Publishing, pp. 115-130.

      • Rendu, W., Beauval, C., Crevecoeur, I., Bayle, P., Balzeau, A., Bismuth, T., Bourguignon, L., Delfour, G., Faivre, J.-P., Lacrampe-Cuyaubère, F., Tavormina, C., Todisco, D., Turq, A., & Maureille, B. (2014). Evidence supporting an intentional Neandertal burial at La Chapelle-aux-Saints. Proceedings of the National Academy of Sciences, 111(1), 81–86.

      • Sandgathe, D. M., Dibble, H. L., Goldberg, P., & McPherron, S. P. (2011). The Roc de Marsal Neandertal child: A reassessment of its status as a deliberate burial. Journal of Human Evolution, 61(3), 243–253.

      • Silver, M. (2016). Conservation Techniques in Cultural Heritage. In E. Stylianidis and F. Remondino (eds) 3D Recording, Documentation and Management of Cultural Heritage. Dunbeath: Whittles Publishing. pp 15-106.

      • Schotsmans, E.M.J., Georges-Zimmermann, P., Ueland, M. and Dent, B.B. (2022). From flesh to bone: Building bridges between taphonomy, archaeothanatology and forensic science for a better understanding of mortuary practices. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 501-541.

    1. eLife assessment

      This paper presents important information about potentially Homo naledi-associated markings discovered on the walls of the Hill Antechamber of the Rising Star Cave system, South Africa. If confirmed, the antiquity, intentionality, and authorship of the reported markings will have profound archaeological implications, as such behaviors are otherwise widely considered to be unique to our species, Homo sapiens. As it stands, the study is incomplete, and the evidence presented does not support the claims about the anthropogenic nature, age, and author of the engravings. While it is appreciated that this report concerns preliminary findings, all reviewers agree that: a) the initial nature of the reported results must be more clearly indicated, b) the anthropogenic nature of the engravings must be adequately demonstrated, c) ideally the chronology of the claimed engravings has to be established for any age estimate to be reliable, and d) the claim about H. naledi being the author of the reported engravings requires robust association.

    2. Reviewer #1 (Public Review):

      I think it is important to note up front that I recognize that the goal of this paper was to announce the discovery of what appear to be intentionally-made marks in Rising Star cave in South Africa. This was not meant to be an in-depth analysis or a declaration of definitive results. With this in mind, I appreciate that the authors did not try to overstate this new discovery, but instead simply reported what had been observed, provided a little bit of background on the current state of the field in regards to the evolution of hominin visual mark-making, made a few tentative identifications, but then clearly acknowledged that a lot more documentation, sampling, and study would be needed before we could understand the full scope and potential importance of this find.

      This is a big claim. If it proves to be true, it has the potential to be paradigm-shifting as the identification of intentional engraved marks, made by a small-brained distant human cousin 200,000+ years ago in South Africa, would completely change our understanding of where, when and who made the first graphic marks. Twenty years ago, this claim would probably have been dismissed out of hand as being too far-fetched to be taken seriously, but there have been some major shifts in the field in recent years, in regard to the age of the art and the identity of the artists, that means this is a claim that should be approached with a scientifically cautious, but open mind. There is now mounting evidence for Neanderthals, and potentially other closely related species as well, to have been engaging in similar art-making practices to our own Homo sapiens ancestors. What makes this particular claim so extraordinary is that these marks are some of the oldest in the world and that Homo naledi is a more distant relation with a smaller brain. This is also what makes the further study of this discovery such a fascinating exercise in scientific inquiry.

      From a technical and methodological perspective, there is an excellent range of tools and technologies that can be used to study these engravings, so I have no doubt that further studies will help answer some of the "nuts and bolts" questions. Then there is also the opportunity created by this discovery to really open a broader dialogue in the field about who were the first artists and at what point does the hominin brain become "primed" for making visual marks. I look forward to all sorts of lively debates in the future and to seeing the results of further in-depth studies.

    3. Reviewer #2 (Public Review):

      Patterns scored into or painted on durable media have long been considered important markers of the cognitive capabilities of hominins. More specifically, the association of such markers with Homo sapiens has been used to argue that our evolutionary success was in part shaped by our unique ability to code, store and convey information through abstract conventions.

      That singularity of association has been cast into doubt in the last decade with finds of designs apparently painted or carved by Neanderthals, and potentially by even earlier hominins. Even allowing for these developments, however, extending the capability to generate putatively abstract designs to a relatively small-brained hominin like Homo naledi is contentious. The evidential bar for such claims is necessarily high, and I don't believe that it has been cleared here.

      The central issue is that the engravings themselves are not dated. As the authors themselves note, the minimum age constraint provided by U/Th on flowstone does not necessarily relate to the last occupation of the Dinaledi cave system, as the earlier ESR age on teeth does not necessarily document first use of the cave. The authors state that "At present we have no evidence limiting the time period across which H. naledi was active in the cave system". On those grounds though, assigning the age range of presently dated material within the cave system to the engravings - as the current title unambiguously does - is not justifiable.

      Because we don't know when they were made, the association between the engravings and Homo naledi rests on the assertion that no humans entered and made alterations to the cave system between its last occupation by Homo naledi, and its recent scientific recording. This is argued on page 6 with the statement that "No physical or cultural evidence of any other hominin population occurs within this part of the cave system".

      There is an important contrast between the quotes I have referred to in the last two paragraphs. In the earlier quote, the absence of evidence for Homo naledi in the cave system >335 ka and <241 ka is not considered evidence for their absence before or after these ages. Just because we have no evidence that Homo naledi was in the cave at 200 ka doesn't mean they weren't there, which is an argument I think most archaeologists would accept. When it comes to other kinds of humans, though - per the latter quote - the opposite approach is taken. Specifically, the present lack of physical evidence of more recent humans in the cave is considered evidence that no such humans visited the cave until its exploration by cavers 40 years ago. I don't think many archaeologists would consider that argument compelling. I can see why the authors would be drawn to make that assertion, but an absence of evidence cannot be used to argue in one way for use of the cave by Homo naledi and in another way for use of the cave by all other humans.

      A second problem is with what Homo naledi might have made engravings. The authors state that "The lines appear to have been made by repeatedly and carefully passing a pointed or sharp lithic fragment or tool into the grooves". The authors then describe one rock with superficial similarities to a flake from the more recent site of Blombos to suggest that sharp-edge stones with which to make the engravings were available to Homo naledi. Blombos is considered relevant here presumably because it has evidence for Middle Stone Age engravings. The authors do not, however, demonstrate any usewear on that stone object such as might be expected if it was used to carve dolomite. Given that it is presented as the only such find in the cave system so far, this seems important.

      My greater concern is that the authors did not compare the profile morphology of the Dinaledi engravings with the extensive literature on the morphology of scored lines caused by sharp-edge stone implements (e.g., Braun et al. 2016, Pante et al. 2017). I appreciate that the research group is reticent to undertake any invasive work until necessary, but non-destructive techniques could have been used to produce profiles with which to test the proposition that the engravings were made with a sharp edge stone.

      One thing I noticed in this respect is that the engravings seem very wide, both in absolute terms and relative to their depth. The data I collected from the Middle Stone Age engraved ochre from Klein Kliphuis suggested average line widths typically around 0.1-0.2 mm (Mackay and Welz 2008). The engraved lines at Dinaledi appear to be much wider, perhaps 2-5 mm. This doesn't discount the possibility that the engravings in the Dinaledi system were carved with a sharp edge stone - the range of outcomes for such engravings in soft rock can be quite variable (Hodgskiss 2010) - only that detailed analysis should precede rather than follow any assertion about their mode of formation.

      None of this is to say that the arguments mounted here are wrong. It should be considered possible that Homo naledi made the engravings in the Dinaledi cave system. The problem is that other explanations are not precluded.

      As an example, the western end of the Dinaledi subsystem has a particular geometry to the intersection of its passages, with three dominant orientations, one vertical (which is to say, north-south), and two diagonal (Figure 1). The major lines on Panel A have one repeated vertical orientation and two repeated diagonal orientations (Figure 16), particularly in the upper area not impacted by stromatolites. The lines in both the cave system and engravings in Panel A appear to intersect at similar angles. Several of the cave features appear, superficially at least, to be replicated. In fact, scaled, rotated, and super-imposed, Figure 16 is a plausible 'mud map' of the western end of the Dinaledi system carved incrementally by people exploring the caves. A figure showing this is included here:

      Of course, there are problems with this suggestion. The choice of the upper part of Panel A is selective, the similarity is superficial, and the scales are not necessarily comparable. (Note, btw, that all of those caveats hold equally well for the comparison the authors make between the unmodified rock from Dinaledi and the flake from Blombos in Figure 19). However, the point is that such a 'mud map hypothesis' is, as with the arguments mounted in this paper, both plausible and hard to prove.

      Having read this paper a few times, I am intrigued by the engravings in the Dinaledi system and look forward to learning more about them as this research unfolds. Based on the evidence presently available, however, I feel that we have no robust grounds for asserting when these engravings were made, by whom they were made, or for what reason they were made.

      References:

      • Braun, D. R., et al. (2016). "Cut marks on bone surfaces: influences on variation in the form of traces of ancient behaviour." Interface Focus 6: 20160006.

      • Hodgskiss, T. (2010). "Identifying grinding, scoring and rubbing use-wear on experimental ochre pieces." Journal of Archaeological Science 37: 3344-3358.

      • Mackay, A. & A. Welz (2008). "Engraved ochre from a Middle Stone Age context at Klein Kliphuis in the Western Cape of South Africa." Journal of Archaeological Science 35: 1521-1532.

      • Pante, M. C., et al. (2017). "A new high-resolution 3-D quantitative method for identifying bone surface modifications with implications for the Early Stone Age archaeological record." J Hum Evol 102: 1-11.

    4. Reviewer #3 (Public Review):

      Lee Berger and colleagues argue here that markings they have found in a dark isolated space in the Rising Star Cave system are likely over a quarter of a million years old and were made intentionally by Homo naledi, whose remains nearby they have previously reported. As in a European and much later case they reference ('Neanderthal engraved 'art' from the Pyrenees'), the entangled issues of demonstrable intentionality, persuasive age and likely authorship will generate much debate among the academic community of rock art specialists. The title of the paper and the reference to 'intentional designs', however, leave no room for doubt as to where the authors stand, despite avoidance of the word art, entering a very disputed terrain. Iain Davidson's (2020) 'Marks, pictures and art: their contributions to revolutions in communication', also referenced here, forms a useful and clearly articulated evolutionary framework for this debate. The key questions are: 'are the markings artefactual or natural?', 'how old are they?' and 'who made them?', questions often intertwined and here, as in the Pyrenees, completely inseparable. I do not think that these questions are definitively answered in this paper and I guess from the language used by the authors (may, might, seem etc) that they do not think so either.

      First, a few referencing issues: the key reference quoted for distinguishing natural from artefactual markings (Fernandez-Jalvo et al. 2014), whilst mentioned in the text, is not included in the references. In the acknowledgements, the claim that "permits to conduct research in the Rising Star Cave system are provided by the South African National Research Foundation" should perhaps refer rather to SAHRA? In the primary description of their own markings from Rising Star and their presumed significance, there are, oddly, several unacknowledged quotes from the abstract of one of the most significant European references (Rodriguez-Vidal et al. 2014). These need attention.

      Before considering the specific arguments of the authors to justify the claims of the title, we should recognise the shift in the academic climate of those concerned with 'ancient markings' that has taken place over the past two or three decades. Before those changes, most specialists would probably have expected all early intentional markings to have been made by Homo sapiens after the African diaspora as part of the explosion of innovative behaviours thought to characterise the 'origins of modern humans'. Now, claims for earlier manifestations of such innovations from a wider geographic range are more favourably received, albeit often fiercely challenged as the case for Pyrenean Neanderthal 'art' shows (White et al. 2020). This change in intellectual thinking does not, however, alter the strict requirements for a successful assertion of earlier intentionality by non-sapiens species. We should also note that stone, despite its ubiquity in early human evolutionary contexts, is a recalcitrant material not easily directly dated whether in the form of walling, artefact manufacture or potentially meaningful markings. The stakes are high but the demands are no less so.

      Why are the markings not natural? Berger and co-authors seem to find support for the artefactual nature of the markings in their location along a passage connecting chambers in the underground Rising Star Cave system. The presumption is that the hominins passed by the marked panel frequently. I recognise the thinking but the argument is weak. More confidently they note that "In previous work researchers have noted the limited depth of artificial lines, their manufacture from multiple parallel striations, and their association into clear arrangement or pattern as evidence of hominin manufacture (Fernandez-Jalvo et al. 2014)". The markings in the Rising Star Cave are said to be shallow, made by repeated grooving with a pointed stone tool that has left striations within the grooves and to form designs that are "geometric expressions" including crosshatching and cruciform shapes. "Composition and ordering" are said to be detectable in the set of grooved markings. Readers of this and their texts will no doubt have various opinions about these matters, mostly related to rather poorly defined or quantified terminology. I reserve judgement, but would draw little comfort from the similarities among equally unconvincing examples of early, especially very early, 'designs'. Two or even three half-convincing arguments do not add up to one convincing one.

      The authors draw our attention to one very interesting issue: given the extensive grooving into the dolomite bedrock by sharp stone objects, where are these objects? Only one potential 'lithic artefact' is reported, a "tool-shaped rock [that] does resemble tools from other contexts of more recent age in southern Africa, such as a silcrete tool with abstract ochre designs on it that was recovered from Blombos Cave (Henshilwood et al. 2018)", also figured by Berger and colleagues. A number of problems derive from this comparison. First, 'tool-shaped rock' is surely a meaningless term: in a modern toolshed 'tool-shaped' would surely need to be refined into 'saw-shaped', 'hammer-shaped' or 'chisel-shaped' to convey meaning? The authors here seem to mean that the Rising Star Cave object is shaped like the Blombos painted stone fragment. But the latter is a painted fragment, not a tool and so any formal similarity is surely superficial and offers no support to the 'tool-ness' of the Rising Star Cave object. Does this mean that Homo naledi took (several?) pointed stone tools down the dark passageways, used them extensively and, whether worn out or still usable, took them all out again when they left? Not impossible, of course. And the lighting?

      The authors rightly note that the circumstance of the markings "makes it challenging to assess whether the engravings are contemporary with the Homo naledi burial evidence from only a few metres away" and more pertinently, whether the hominins did the markings. Despite this honest admission, they are prepared to hypothesise that the hominin marked, without, it seems, any convincing evidence. If archaeologists took juxtaposition to demonstrate authorship, there would be any number of unlikely claims for the authorship of rock paintings or even stone tools. The idea that there were no entries into this Cave system between the Homo naledi individuals and the last two decades is an assertion, not an observation, and the relationship between hominins and designs no less so. In fact, the only 'evidence' for the age of the markings is given by the age of the Homo naledi remains, as no attempt at the, admittedly very difficult, perhaps impossible, task of geochronological assessment, has been made.

      The claims relating to artificiality, age and authorship made here seem entangled, premature and speculative. Whilst there is no evidence to refute them, there isn't convincing evidence to confirm them.

      References:

      • Davidson, I. 2020. Marks, pictures and art: their contribution to revolutions in communication. Journal of Archaeological Method and Theory 27(3): 745-770.

      • Henshilwood, C.S. et al. 2018. An abstract drawing from the 73,000-year-old levels at Blombos Cave, South Africa. Nature 562: 115-118.

      • Rodriguez-Vidal, J. et al. 2014. A rock engraving made by Neanderthals in Gibraltar. Proceedings of the National Academy of Sciences.

      • White, Randall et al. 2020. Still no archaeological evidence that Neanderthals created Iberian cave art.

    5. Reviewer #4 (Public Review):

      This is potentially a landmark study with far-reaching consequences for archaeology, palaeoanthropology, and more widely. The antiquity of intentional human mark-making is a hot topic, but this study – understood as initial – as yet has incomplete sources of evidence and methods, and it will be interesting to follow how it develops in subsequent studies.

      Strengths and points to build on:

      * Heuristic potential: As knowledge advances it poses a risk to accepted knowledge – and we should accept that one such risk is moving on from long-held disciplinary tenets. In this case, there has been a growing quantum of evidence – all hotly debated – for the deep antiquity of mark-making and even symbolism by species other than ourselves. Most researchers now accept Neanderthal symbolic capacity actualised in burials, intentional mark-making and the like. The evidence here presented is not unequivocal but is very suggestive and an ideal test case for applying multi-disciplinary techniques of analysis and interpretation beyond the expertise of the listed authors (see comments in 'weaknesses'). This work by itself may be equivocal but, when taken together with other such work, points to a 'human' sensu lato past that is as complex as it is long. This work then helps all researchers to at least be alive to the possibility of things like anthropic marks and residues in a context not normally thought to have them.

      * Decentering speciesism: As per the above comment, I appreciate empirical studies that erode speciesism – in particular studies that open up our minds to the possibility that multiple members of the Genus Homo were capable of intentional mark-making and even 'symbolic' behaviour, though this latter term is not well understood or uniformly used. This is probably because of continuous unconscious bias on our part as currently the only exemplar of our genus living - in contrast to most of the past in which different species and genera co-existed - if not on the same landscape and/or at exactly the same time, then with enough overlap that people would have realised 'others' were about either by sight and/or by encountering their physical remains and artefacts.

      * Problematising 'firsts' and deep time: A strength – but which needs to be developed in this manuscript – is our understanding of time and change. We have a plethora of dating techniques but relatively few substantive monographs, articles, and think tanks on time – and especially on how change comes about and what causes it. This leads us to privilege 'firsts' and the 'oldest' finds in 'deep' time above those that are more recent and in 'shallow' time. I would suggest that, in addition to the claims for the oldest of the reported marks, the authors develop their nascent remarks on the possibility that the suite of marks may have been made over time. This will help counter the criticism that these marks – if established to be anthropic – were just a singularity, and instead show them to be part of patterned behaviour, which would move them towards the realm of 'symbolic' cognitive behaviour. And indeed, it would be good to hear more about why these marks were made in this place, to establish a replicable model for identifying early anthropic marks.

      Ultimately, this manuscript presents evidence that those who are pro the deep antiquity of intentional mark-making by Homo (and possibly even other genera) will find enough evidence to support; while those sceptical of such claims will find enough methodological flaws and evidential limits to refute those claims. The next decade of work will likely be definitive and this article makes a key contribution to the debate.

      Weaknesses and points to attend to:

      * Definitions: The term 'rock engraving' is used rather uncritically and also the term 'etching' – and it would be useful to have a short definition of how the authors understand the term. Rock art scholars regularly debate these terms and whether they are or are not 'rock art' with its overwhelmingly visual bias; which this discovery may usefully help overthrow and advance.

      * Dating: There is no evidence provided for dating the marks found in the cave system. They could, for example, have been made more recently than the dates claimed – and by another species (if we accept their anthropogenic authorship). This is a perennial problem of much rock art research – especially when it comes to understanding the wider archaeological/palaeoanthropological context. More crucially, accurate dating allows a more reliable understanding of authorship and who/what was responsible for a particular artefact or feature. This has not been demonstrated in this case, though we do have fossil evidence of Homo naledi in the cave system. The article title is thus an incorrect and unsupported claim, as the marks, if they are anthropic, have not been dated and are of unknown age. The authors allow that there may have been multiple episodes, but not that the marks can belong to a time other than they posit – either earlier, later, or distributed over a long period as the authors allow for in their concluding remarks.

      * Authorship: The authorial team includes neither a geoscientist nor a rock art specialist. These are key oversights, as the former would help better contextualise the dating of the marks reported on, as well as explore alternative non-anthropogenic agents that may have created them. For example, the marks and 'pitting' etc. may be the result of water bringing abrasive agents during times of flooding, hitting prominent rock features in the cave system. Some explanation is given in lines 114-124, but it is uncited. The overlying 'sediment' may be similar to the mondmilch found in cave systems, which is of natural origin. It may be that these non-anthropogenic causes are easy to discount, but the arguments do need to be made. Alternatively, the polishing may have been made by Homo naledi brushing against the surfaces as they moved in the cave system, independent of any mark-making. A table showing the pros and cons of intentional anthropic versus natural authorship would be very effective, as would showing some of the natural linear marks in the cave system to avoid any confirmation or similar bias. FTIR analysis of panels A-C would be more than useful to determine whether an additional layer of material has been added. This is mentioned for future work, but this seems a rather post-hoc research programme.

      * Use-wear analysis: If the marks are anthropic in origin; they are likely to have been made by a stone tool, which would leave characteristic marks, directionality and sequencing, distinct from natural causes. It is vital this work – such as was done on the Blombos engraved ochre – is done here – for example, linking to the chert and other tools described on lines 152-158. Note Figure 19, of such a tool, is very hard to make out. The Blombos – and Klasies River Mouth engraved ochres (curiously not referenced) – have very similar geometric markings and there is a real opportunity to compare these in securely dated contexts of 70-120 kya –which could support the argument made here for Homo naledi's cognitive capacity. On figure 16 it would be good to know on what basis some marks were selected as anthropic – and why others were not; this would help demonstrate the methodology and ability to distinguish between the two kinds of marks.

      * Viewshed: The rock art specialist would have added essential expertise on how to study anthropic marks. For example, the images of the marks shown are all of individual or small collections of motifs rather than showing each panel as well as all panels together, to help understand the iconographic context as an ensemble – a 'feature' rather than isolated 'artefacts' or 'motifs'. Line 60 mentions being able to see these as a 'triptych' but the reader is not able to have this view in this manuscript. From the cave map, it is not clear whether all three 'panels' (an unfortunate art historical term that suggests a framed entity - better to use a term like 'cluster') can be viewed simultaneously or in sequence. The view shed in relation to the area where the bodies were recovered is vaguely stated as 'only a few metres away' and is worth developing. I understand 3D scans have been made so it would be useful to have a version showing the marks in relation to where the bodies were recovered and as a 3-cluster ensemble.

      * Image enhancements: Also, in addition to polarised images, have colour enhancement tools like DStretch been tried to see if, for example, attempts at colouring with different coloured sands were made? Similarly, a 3D scan of the motif and panel (Metashape is mentioned but not shown) might assist in understanding how the marks and the rock they are on might relate to each other, as research in European Upper Palaeolithic contexts has shown. Here, might it be worth experimenting with different kinds of lighting or, in the absence of lighting, with tactility, to explore how these marks and their rock support may have been experienced by those who may have made and interacted with them? As a note, it would be useful to have a scale in each image of the 'engravings', and it is a pity the one in situ photograph with a scale does not use the standard colour-corrected scale commonly used in rock art research.

    6. Author Response:

      We would like to thank the eLife reviewers for the considerable time and effort they have invested to review these manuscripts. We have also benefited from a previous round of review of the manuscript describing the proposed burial features, which underwent two rounds of revisions in a high-impact journal over a period of approximately 8 months during 2022 and early 2023. Both sets of reviews have reflected mixed responses to the evidence we have presented, with one reviewer recommending acceptance with minor editorial revisions, two recommending acceptance with minor revisions, and the fourth recommending rejection based upon arguments similar to those made by some of the reviewers in this current round of reviews in eLife. Ultimately, the managing editor of this first journal took the decision that the review process could not be completed in a timely manner and rejected the manuscript, although the submission here reflects our consideration of these reviewers' suggestions.

      We have chosen in this initial response to the eLife reviews to include some references to the previous anonymous reviews in order to illustrate differences of opinion and differences in revision suggestions within the review process. Our goal is to offer maximal insight into our decision-making process and to acknowledge the considerable time and effort put into the assessment of these manuscripts by reviewers (for eLife and in the case of the earlier review process). We hope that this approach will assist the readers, and reviewers, of our manuscripts in understanding why we are proceeding with certain decisions during the revision process.

      This is a new process for us and the reviewers, and one way in which it significantly differs from more traditional review is that both the reviews and our reply will be public well in advance of our revisions to the manuscript. Indeed, considering the scope of the reviews, some of those revisions may take considerable time, although many can be accomplished fairly easily. Thus, we are not in a position to say that we have solved every issue raised by the reviewers. Instead, we will examine what appear to be the key critical issues raised regarding the data and the analyses and how we propose to address these as we revise the papers. We will also address several philosophical and ethical issues raised by the reviews and our proposal for dealing with these. More specific editorial and citational recommendations will be dealt with on a case-by-case basis, and we do not address these point-by-point in this reply. Please note, this response to the reviewers is not the revision of the manuscript and is only the initial opinion of the corresponding authors with some guidance from the larger group of authors of all three papers. Our final submitted revision will reflect the input of all authors included on those submissions.

      We took the decision to submit three separate papers consciously. The two different categories of evidence, burials and engravings, involve different kinds of analysis and different (although overlapping) teams of researchers, and we recognized that each deserved its own presentation and assessment. Meanwhile, together they inform the context of H. naledi in a way that requires some synthetic discussion, in which both kinds of evidence are relevant, leading to a third paper. But the mutual relevance of these different kinds of evidence and their review by a common set of reviewers naturally raises cross-cutting issues, and the reviewers have cross-referenced the three articles. This has sometimes led to suggestions about one manuscript based on the contents of another. Considering the situation, we accepted the recommendation that it would be clearer to consider all three articles in a single reply. Thus, while each of the three papers will proceed separately during the revision process, it will occasionally be necessary to refer across all three papers in our responses.

      Scientific Issues:

      In reading the reviews, we feel there are 9 critical points/assertions raised by one or more of the reviewers that present a problem for, or challenge to, our hypothesis that the observed evidence (bone accumulations and engravings) described in the Dinaledi subsystem are of intentional naledigenic origin. These are:

      1. The evidence presented does not demonstrate a clear interruption of the floor sediments, thus failing to demonstrate excavated holes.

      2. The sediments infilling the holes where the skeletal remains are found have not been demonstrated to originate from the disruption of the floor sediments and thus could be part of a natural geological process (e.g. water movement, slumping) or carnivore accumulations.

      3. Previous geological interpretations by our research group have given alternative geological explanations for formation of the bony accumulations that contradict the present evidence presented here and result in alternative origins hypotheses.

      4. Burial cannot be effectively assessed without complete excavation of the features and site.

      5. The skeletal remains as presented do not conform clearly to typical body arrangement/positions associated with human (Homo sapiens) burials.

      6. There is no evidence of grave goods or lithic scatters that are typically associated with human burials.

      7. Humans may have been involved with the creation of either the Homo naledi bone accumulations, the engravings, or both.

      8. Without a date of the engravings, the null hypothesis should be the engravings were created by Homo sapiens.

      9. The null hypothesis for explanation of the skeletal remains in this situation should be “natural accumulation”.

      Our analysis of the Dinaledi Feature 1 leads us to accept that the laminated orange-red mudstone (LORM) sedimentary layer is interrupted, indicating a non-natural intervention, and that the hole created by the interruption was then filled by a fleshed body (and perhaps parts of other bodies), which was then covered by sediment that originated from the hole that was dug. We recognize that the four eLife reviewers are not convinced that our presentation is sufficient to establish this. Interestingly, this was not the universal opinion of earlier reviewers of the initial manuscript, several of whom felt we had adequately supported this hypothesis. The lack of clarity in this current version of the burial manuscript is our responsibility. In the upcoming revision of this paper, we will take the reviewers’ critiques to heart and add figures that better illustrate the disruption of the LORM and clarify the sedimentological data showing that the material covering the skeletal remains in the hole is the disrupted sediment excavated from the same hole. We are proposing to isolate this most critical evidence for burial into a separate section in the revised submission based on the reviewers’ comments. The fact that the LORM layer is disrupted, a fleshed body was placed in the hole created by this disruption, and the body (and perhaps parts of other bodies) was/were then covered by the same sediments from the hole is the central feature of our hypothesis that the bone accumulations observed reflect a burial and not a natural process.

      The possibility of fluvial transport or involvement in the subsystem is a topic that we have addressed extensively in past work, and it is clear from these reviews that we must enhance our current manuscript to discuss this issue at greater length. Our previous work (Dirks et al. 2015; Dirks et al. 2017) emphasized that fluvial transport of whole bodies into the subsystem was precluded by several lines of sedimentological evidence. We excavated a rich accumulation of skeletal remains, including articulated limbs and other elements in subvertical orientations inconsistent with slow sedimentary infill, which were difficult to explain without positing either a large and dense pile of bodies and/or sediment movement. We encountered fractured chunks of laminated orange-red mudstone (LORM) in random orientations within our excavation area, within and among skeletal remains, which directly refuted that the remains were inundated with water at the time of burial, and this limited the possibility of fluvial transport. Water flow sufficient to displace bodies or complete skeletal evidence would also transport large and coarse sediment, which is absent from the subsystem, and would sort the commingled skeletal material that we found by size, which we do not observe. But our excavation covered less than a square meter at very limited depth, and this was the limit to our knowledge of subsurface sediment. We thus were left with uncertainty that led us to suggest the possibility of sediment slumping or movement into subsurface drains, although these were not observed near our excavation. Our current work expands our knowledge of the subsurface and presents an alternative explanation for the disposition of skeletal remains from our earlier excavation. But we acknowledge that this new explanation is vulnerable to our own previous published proposals, and we must do a better job of explaining how the new information addresses our previous suggestions. By not clearly creating a section where we explained how these previous hypotheses were now nullified by new evidence, we clearly confused the reviewers with our own previous work. We will revise the manuscript by enhancing the review of the significant geological evidence demonstrating that there is no significant fluvial action in the system and making it clear how the burial hypothesis provides a clearer explanation for the situation of skeletal remains from our previous excavation work.

      One of the central issues raised by reviewers has been a perceived need to excavate these features completely, totally exhuming all skeletal remains from them. Reviewers have written that it is necessary to identify every skeletal element that is present and account for any missing elements. On this point, we have both ethical and scientific differences from these reviewers. We express our ethical concerns first. Many of the best-preserved possible burials ever discovered by archaeologists were subjected to total excavation and exhumation. Cases like La Chapelle-aux-Saints, La Ferrassie, and Skhūl were fully excavated at a time when data recording and excavation methods did not include the range of spatial and geomorphological approaches that later became routine. The judgment of early investigators that these situations were intentional burials was challenged by later workers, and the kind of information that might enable better tests had been irrevocably lost (Gargett 1999; Dibble et al. 2015; Rendu et al. 2014).

      Later, improved excavation standards have not sufficed to remove uncertainty or debate about possible burials. For example, it was long presumed that well-preserved remains of young children were by themselves diagnostic of intentional burial, such as those from Dederiyeh, Border Cave, or Roc de Marsal. Such cases were also fully excavated, with adequate documentation of the positioning of skeletal remains and their surrounding stratigraphic situation, but they were later challenged on several bases, and the complete exhumation of material has confused or precluded testing of new hypotheses (e.g. Gargett 1999). The case of Roc de Marsal is one in which data from the initial excavation combined with re-excavation and geoarchaeological analysis led to a naturalistic interpretation of the skeletal material (Sandgathe et al. 2011; Goldberg et al. 2017). But even in this case, the researchers erred in their interpretation of the skeleton’s situation due to a lack of identification of parts of the infant’s skeleton (Gómez-Olivencia and García-Martinez 2019). That is to say, it is not only the burial hypothesis but other hypotheses that suffer from complete excavation. Researchers concerned with preserving all possible information have sometimes taken extraordinary measures to remove and study possible burials at high resolution in the laboratory. Such was the case of the Shanidar IV burial, removed from the site and transported in a plaster jacket by Solecki, which led to the disruption and loss of internal stratigraphic information (Pomeroy et al. 2020). Arguably, the current state of the art is full excavation with partial preparation, such as that undertaken at Panga ya Saidi (Martinón-Torres et al. 2021). But again, any future attempt to reinterpret or test the hypothesis of burial must rely on the adequacy of documentation, as the original context has been removed.

      In our decision to leave material in place as much as possible, we are expanding upon standard practice to leave witness sections and unexcavated areas for future research. The situation is novel, representing possible burials by a nonhuman species, and that makes it doubly important in our opinion to be conservative in not fully exhuming the skeletal material from its context. We anticipate that many other researchers, including future investigators, will suggest additional methods to further test the hypothesis of burial, something that would be impossible if we had excavated the features in their entirety prior to publishing a description of our work. We believe strongly that our ethical responsibility is to publish the work and the most likely interpretation while leaving as much evidence in place as possible to enable further testing and replication. We welcome the suggestions of additional methods/analyses to test the H. naledi burial hypothesis.

      This being said, we also observe that total exhumation would not resolve the concerns raised by the reviewers. The recommendation of total exhumation is in pursuit of a full account of all skeletal material present and its preservation and spatial situation, in order to demonstrate that they conform to body positions comparable to human burials. As has been highlighted in forensic casework, the excavation of an inhumation feature does not necessarily provide an accurate spatial or anatomical manifest of the stratigraphical relationships between the body, encapsulating matrix, and any cut present due to preservational, taphonomic and operational factors (Dirkmaat and Cabo, 2016; Hunter, 2014). In particular, in cases where skeletal elements are highly fragmented, friable, or degraded (such as through bioerosion) then complete excavation—even under controlled laboratory conditions—may destroy bone and severely limit skeletal identification (Henderson, 1997; Hochrein, 2002; Owsley and Compton, 1997), particularly in elements where the ratio of trabecular to cortical bone is high (Darwent and Lyman, 2002; Lyman, 1994). As such, non-invasive methods of 3D and 4D modelling (preservation in situ) are often considered preferable to complete necropsy or excavation (preservation by record) where appropriate (Bolliger and Thali, 2009; Dell’Unto and Landeschi, 2022; Randolph-Quinney et al., 2018; Silver, 2016). 

      The test of burial is not primarily positional, but taphonomic and geological. The position and number of bones can elaborate on process-driven questions of decay and destruction in the burial environment, or post-mortem modification, but are not singularly indicative of whether the remains were intentionally buried – the post-mortem narrative of all the processes affecting the cadaveric island is required (Knüsel and Robb, 2016). In previous cases, researchers have disputed or accepted the hypothesis of intentional hominin burial based upon assumptions about how modern humans or Neandertals would have positioned bodies, with the idea that some positions reflect ritual intent while others do not. But applying such assumptions is unjustifiable, particularly for a species like H. naledi, whose culture may have differed fundamentally from our own. Our work acknowledges that the present evidence does not enable a full reconstruction of the burial positions, but it does show that fleshed remains were encased in sediment prior to decomposition of soft tissue, and that subsequent spatial changes can be most parsimoniously explained by natural decomposition within sedimentary matrix contained within a burial feature (after Green, 2022; Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022). If the argument is that extraordinary claims require extraordinary evidence, we feel that the evidence documents excavation and interment (and will do so more clearly in the revision), and the fact that the remains do not match a “typical” human burial in body positioning is not in itself evidence that these are not H. naledi burials.

      We feel that the reviewers (in keeping with many palaeoanthropologists) have a clear idea of what they “think” a burial should look like in an idealised sense, but this platonic ideal of burial form is not matched by the extensive literature in archaeothanatology, funerary archaeology and forensic science which indicates enormous variability in the activity, morphology and post-mortem system experienced by the human body in cases of interment and body disposal (e.g. Aspöck, 2008; Boulestin and Duday, 2005 and 2006; Connelly et al., 2005; Channing and Randolph-Quinney, 2006; Cherryson, 2008; Donnelly et al., 1995; Finley, 2000; Hunter, 2014; Parker Pearson, 1999; Randolph-Quinney, 2013). Decades of experience in the identification, recovery and interpretation of clandestine, deviant, and non-formal burials indicates the platonic ideal is rare, and in many contexts, the exception (Cherryson, 2008; Parker Pearson, 1999). This variability is particularly relevant to morphological traits in burial context, such as the informal nature of the grave cut in plan and section, shallow burial depth, and initial disposition of body (placement) during the early post-mortem period. These might run counter to the expectations of reviewers or others referencing the fossil hominin record, but are well accepted within the communities of researchers investigating Holocene archaeological sites and forensic contexts.

      It is encouraging to see reviewers beginning to incorporate the extensive (often experimentally derived) literature from archaeothanatology and forensic taphonomy in their deliberations, and we will be taking these comments on board going forward. In particular, we acknowledge reviewers’ comments and the need to construct a more detailed post-mortem narrative, accounting for joint disarticulation (labile versus persistent joints etc.), displacement, and final disposition of elements within the burial space. As such, we will incorporate the hierarchy of decomposition (rank order disarticulation), associations between regions of anatomical association, areas of disassociation, and the voids produced during decomposition (after Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022) into our narrative. In doing so we acknowledge the tensions between the inductive, narrative-driven archaeothanatological approach (e.g. Duday, 2005 & 2009) and the robust decomposition data derived from human forensic taphonomic experimentation, recently articulated by Schotsmans and colleagues (2022), noting that we will highlight comparative data based on forensic experimental casework and actualistic modelling over inductive intuitive approaches, which come with significant evidential shortcomings (Bristow et al. 2011).

      Finally, from a taphonomic perspective it is worth pointing out to reviewers that we have already addressed the issue of lack of taphonomic evidence for carnivore involvement in the formation of the Dinaledi assemblage (Dirks, et al., 2016). Absence of any carnivore-induced bone surface modifications, patterns of skeletal part representation, and a total absence of any carnivore remains found within the Dinaledi chamber (following Kuhn and colleagues, 2010) lead us to reject carnivores as possible vectors of body accumulation within the Dinaledi Chamber and Hill Antechamber.

      Reviewers suggest that without a date derived from geochronological methods, the engravings cannot be associated with H. naledi, and that it is possible (or probable) that the engravings were done in the recent past by H. sapiens. This suggestion neglects the context of the site. We have previously documented the structure and extremely limited accessibility of the Dinaledi subsystem. This subsystem was not recorded on maps of the documented Rising Star Cave system prior to our work and its discovery by our teams. Furthermore, there is no evidence of prehistoric human activity in the areas of the cave related to possible subterranean entrances. There is no evidence that humans in the past typically ventured into extreme spaces like those of Rising Star. It is clear from the presence of the remains of many individuals that H. naledi ventured into these spaces again and again. It is likely that H. naledi moved through these spaces more easily than humans do, based on their physique. We show that the engravings overlie each other, suggesting multiple engraving events. These engravings took time and effort, and the only evidence for use of the Dinaledi subsystem by any hominin is by H. naledi. The context leads to the null hypothesis that H. naledi made the marks. In our revision, we will elaborate on this argument to clarify the evidence for our stance on this hypothesis. Several reviewers took issue with the title of the engraving paper, as we did not insert a qualifier in front of the suggested date range for the engravings. We deliberately left out qualifying language so that the title took the form of a testable hypothesis rather than a weak assertion. Should future work find the engravings were not produced within this time range, then we will restate this hypothesis.

      Finally, with regard to the engravings, we have chosen to report them because they exist. Not reporting the presence of engraved marks on the walls of a cave above hypothesized burials would be tantamount to leaving relevant evidence out of the description of an archaeological context. We recognize and state in our manuscript that these markings require substantial further study, including attempts at geochronological dating. But the current evidence is clearly relevant to the archaeological context of the subsystem. We take a similar stance with reporting the presence of the tool-shaped artefact near the hand of the H. naledi skeleton in the Hill Antechamber. It is evident that this object requires further study, as we stated in our manuscript, but again omitting it from our study would be leaving out relevant evidence.

      Some have suggested that the null hypothesis should be that all of these observed circumstances are of natural origin. Our team took this approach in our early investigation of the Dinaledi subsystem (Dirks et al. 2015). We adopted the null hypothesis that the geological processes involved in the accumulation of H. naledi skeletal remains were “natural” (e.g., non-naledigenic involvement), and we were able to reject many alternative explanations for the assemblage, including carnivore accumulation, “death trap” accumulation, and fluvial transport of bodies or bones (Dirks et al. 2015). This led us to the hypothesis that H. naledi were involved in bringing the bodies into the spaces where they were found. But we did not hypothesize their involvement in the formation of the deposit itself beyond bringing the bodies to the location.

      This approach seems conservative. It followed the traditional view that small-brained hominins do not engage in cultural practices. But we recognize in hindsight that this null hypothesis approach did harm to our analyses. It impeded us from recognizing within our initial excavations of the puzzle box area and other excavations between 2014 – 2017 that we might be encountering remains that were intrusive in the sedimentary floor of the chamber. If we had approached the accumulation of a large number of hominins from the perspective of the null hypothesis being that the situation was likely cultural, we perhaps would have collected evidence in a slightly different manner. We certainly note that if the Dinaledi system had been full of the remains of modern humans, there would have been little doubt that the null hypothesis would have been that this was a cultural space and not a “natural space”.  We therefore respectfully disagree with the reviewers who continue to support the idea that we should approach hominin excavations with the null hypothesis that they will be natural (specifically non-cultural) in origins. If excavations continue with this mindset we believe that potential cultural evidence is almost certain to be lost.

      There has been a gradient across paleoanthropological excavations, archaeological work, and forensic investigation, with increasing precision of context. The reality is that the recording precision and frame of approach is typically different in most paleontological excavations than in those related to contemporary human remains. If anything comes from the present discussion of whether the Dinaledi system is a burial site for H. naledi or not, we hope that by taking seriously the possibility of deep cultural dynamics of hominins, we will encourage other teams to meet the highest standards of excavation in order to preserve potential cultural evidence. Given H. naledi’s cranial capacity we suggest that even very early hominin skeletal assemblages should be re-examined, if there is sufficient evidence or records available. These would include examples such as the A.L. 333 Au. afarensis site (the so-called First Family site in Hadar, Ethiopia), the Dikika infant skeleton, WT 15000 (Turkana Boy) and even A.L. 288 (Lucy), as such unusual taphonomic situations where skeletons are preserved cannot be simply explained away as “natural” in origin, based solely on the cranial capacity and assumed lack of cognitive and cultural complexity of the hominins, as emphasized by us in Fuentes et al. (2023). We are not the first to observe that some very early hominin situations may represent early mortuary activity (Pettitt 2013), but we would advocate a step further. We suggest it may be damaging to take “natural accumulation” as the standard null hypothesis for hominin paleoanthropology, and that it is more conservative in practice to engage remains with the null hypothesis of possible cultural formation.

      We are deeply grateful for the time and effort all of the 8 reviewers (across three reviews) have taken with this work. We also acknowledge the anonymous reviewers from previous submissions whose opinions and comments will have made the final iterations of these manuscripts better for their efforts. As this process is rather public and includes commentary outside of the eLife forum, we ask that the efforts of all 37 authors and 8 reviewers involved be respected and that the discourse remain professional in all venues as we study this fascinating and quite complex occurrence. We appreciate also the efforts of members of the public who have engaged with this relatively new process where preprints are posted prior to the reviews, allowing comments and interactions from colleagues and the public who are normally not part of the internal peer review process. We believe these interactions will make for better final papers. We feel we have met the standards of demonstrating burials in H. naledi and that the engravings are most likely associated with H. naledi. However, given the reviews we see many areas where our clarity and context, and analyses, were less strong than they can be. With the clarifications and additions taken on board through these review processes the final papers will be stronger and clearer. We recognize that this is an ongoing process of scientific investigation and further work will allow continued, and possibly better, evaluation of these hypotheses and others.

      Lee R Berger, Agustín Fuentes, John Hawks, Tebogo Makhubela

      Works cited:

      • Aspöck, E. (2008). What Actually is a ‘Deviant Burial’?: Comparing German-Language and Anglophone Research on ‘Deviant Burials.’ In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books.  pp 17–34.

      • Bolliger, S.A. & Thali, M.J. (2009). Thanatology. In S.A. Bolliger and M.J. Thali (eds) Virtopsy Approach:  3D Optical and Radiological Scanning and Reconstruction in Forensic Medicine. Boca Raton: CRC Press. pp 187-218.

      • Boulestin, B. & Duday, H. (2005). Ethnologie et archéologie de la mort: de l’illusion des références à l’emploi d’un vocabulaire. In: C. Mordant and G. Depierre (eds) Les Pratiques Funéraires à l’Âge du Bronze en France. Actes de la table ronde de Sens-en-Bourgogne. Paris: Éditions du Comité des Travaux Historiques et Scientifiques. pp. 17–30.

      • Boulestin, B. & Duday, H. (2006). Ethnology and archaeology of death: from the illusion of references to the use of a terminology. Archaeologia Polona 44: 149–169.

      • Bristow, J., Simms, Z. & Randolph-Quinney, P.S. Taphonomy. In S. Black and E. Ferguson (eds.) Forensic Anthropology 2000-2010. Boca Raton, FL: CRC Press. pp 279-318.

      • Channing, J. & Randolph-Quinney, P.S. (2006). Death, decay and reconstruction: the archaeology of Ballykilmore Cemetery, County Westmeath. In J. O’Sullivan and M. Stanley (eds.) Settlement, Industry and Ritual: Archaeology. National Roads Authority Monograph Series No. 3. Dublin: NRA/Four Courts Press. pp 113-126.

      • Cherryson, A. K. (2008). Normal, Deviant and Atypical: Burial Variation in Late Saxon Wessex, c. AD 700–1100. In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books. pp 115–130.

      • Connolly, M., F. Coyne & L. G. Lynch (2005). Underworld: Death and Burial in Cloghermore Cave, Co. Kerry. Bray, Co. Wicklow: Wordwell.

      • Darwent, C. M. & R. L. Lyman (2002). Detecting  the postburial fragmentation of carpals, tarsals and phalanges. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press. pp 355-378.

      • d’Errico, F., & Backwell, L. (2016). Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. Journal of Human Evolution, 93, 91–108.

      • De Villiers, H. (1973). Human skeletal remains from Border Cave, Ingwavuma District, KwaZulu, South Africa. Annals of the Transvaal Museum, 28(13), 229–246.

      • Dell’Unto, N. and Landeschi, G. (2022). Archaeological 3D GIS. London: Routledge.

      • Dibble, H. L., Aldeias, V., Goldberg, P., McPherron, S. P., Sandgathe, D., & Steele, T. E. (2015). A critical look at evidence from La Chapelle-aux-Saints supporting an intentional Neandertal burial. Journal of Archaeological Science, 53, 649–657.

      • Dirkmaat, D. C., & Cabo, L. L. (2016). Forensic archaeology and forensic taphonomy: basic considerations on how to properly process and interpret the outdoor forensic scene. Academic Forensic Pathology 6, 439–454.

      • Dirks, P. H., Berger, L. R., Roberts, E. M., Kramers, J. D., Hawks, J., Randolph-Quinney, P. S., Elliott, M., Musiba, C. M., Churchill, S. E., de Ruiter, D. J., Schmid, P., Backwell, L. R., Belyanin, G. A., Boshoff, P., Hunter, K. L., Feuerriegel, E. M., Gurtov, A., Harrison, J. du G., Hunter, R., … Tucker, S. (2015). Geological and taphonomic context for the new hominin species Homo naledi from the Dinaledi Chamber, South Africa. ELife, 4, e09561.

      • Dirks, P.H.G.M., Berger, L.R., Hawks, J., Randolph-Quinney, P.S., Backwell, L.R., and Roberts, E.M. (2016). Comment on “Deliberate body disposal by hominins in the Dinaledi Chamber, Cradle of Humankind, South Africa?” [J. Hum. Evol. 96 (2016) 145-148]. Journal of Human Evolution 96:  149-153.

      • Dirks, P. H., Roberts, E. M., Hilbert-Wolf, H., Kramers, J. D., Hawks, J., Dosseto, A., Duval, M., Elliott, M., Evans, M., Grün, R., Hellstrom, J., Herries, A. I., Joannes-Boyau, R., Makhubela, T. V., Placzek, C. J., Robbins, J., Spandler, C., Wiersma, J., Woodhead, J., & Berger, L. R. (2017). The age of Homo naledi and associated sediments in the Rising Star Cave, South Africa. ELife, 6, e24231.

      • Donnelly, S., C. Donnelly & E. Murphy (1999). The forgotten dead: The cíllíní and disused burial grounds of Ballintoy, County Antrim. Ulster Journal of Archaeology 58, 109-113.

      • Duday, H. (2005). L’archéothanatologie ou l’archéologie de la mort. In: O. Dutour, J.-J. Hublin and B. Vandermeersch (eds) Objets et Méthodes en Paléoanthropologie. Paris: Comité des Travaux Historiques et Scientifiques. pp. 153–215.

      • Duday, H. (2009). Archaeology of the Dead: Lectures in Archaeothanatology. Oxford: Oxbow Books.

      • Finley, N. (2000). Outside of life: Traditions of infant burial in Ireland from cillin to cist.  World Archaeology 31, 407-422.

      • Gargett, R. H. (1999). Middle Palaeolithic burial is not a dead issue: The view from Qafzeh, Saint-Césaire, Kebara, Amud, and Dederiyeh. Journal of Human Evolution, 37(1), 27–90.

      • Goldberg, P., Aldeias, V., Dibble, H., McPherron, S., Sandgathe, D., & Turq, A. (2017). Testing the Roc de Marsal Neandertal “Burial” with Geoarchaeology. Archaeological and Anthropological Sciences, 9(6), 1005–1015.

      • Gómez-Olivencia, A., & García-Martínez, D. (2019). New postcranial remains from the Roc de Marsal Neandertal child. PALEO. Revue d’archéologie Préhistorique, 30–1, 30–1.

      • Green, E.C. (2022). An archaeothanatological approach to the identification of late Anglo-Saxon burials in wooden containers. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 436-455.

      • Henderson, J. (1987). Factors determining the state of preservation of human remains. In A. Boddington, A. Garland and R. Janaway (eds). Death, Decay and Reconstruction: Approaches to Archaeology and Forensic Science. Manchester: Manchester University Press. pp 43-54.

      • Hunter, J. R. (2014). Human remains recovery: archaeological and forensic perspectives. In C. Smith (ed). Encyclopedia of Global Archaeology. New York: Springer New York. pp 3549-3556.

      • Hochrein, M. (2002). An Autopsy of the Grave: Recognizing, Collecting and Preserving Forensic Geotaphonomic Evidence. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press: 45-70.

      • Knüsel, C.J. & Robb, J. (2016). Funerary taphonomy: An overview of goals and methods. Journal of Archaeological Science: Reports 10, 655-673.

      • Kuhn, B.F., Berger, L.R. & Skinner, J.D. (2010). Examining criteria for identifying and differentiating fossil faunal assemblages accumulated by hyenas and hominins using extant hyenid accumulations. International Journal of Osteoarchaeology 20, 15-35.

      • Lyman, R. (1994). Vertebrate Taphonomy. Cambridge, Cambridge University Press.

      • Martinón-Torres, M., d’Errico, F., Santos, E., Álvaro Gallo, A., Amano, N., Archer, W., Armitage, S. J., Arsuaga, J. L., Bermúdez de Castro, J. M., Blinkhorn, J., Crowther, A., Douka, K., Dubernet, S., Faulkner, P., Fernández-Colón, P., Kourampas, N., González García, J., Larreina, D., Le Bourdonnec, F.-X., … Petraglia, M. D. (2021). Earliest known human burial in Africa. Nature, 593(7857), 7857.

      • Mickleburgh, H.L & Wescott, D.J. (2018). Controlled experimental observations on joint disarticulation and bone displacement of a human body in an open pit: implications for funerary archaeology. Journal of Archaeological Science: Reports 20: 158-167.

      • Mickleburgh, H.L., Wescott, D.J., Gluschitz, S. & Klinkenberg, V.M. (2022). Exploring the use of actualistic forensic taphonomy in the study of (forensic) archaeological human burials: An actualistic experimental research programme at the Forensic Anthropology Center at Texas State University (FACTS), San Marcos, Texas. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 542-562.

      • Owsley, D. & B. Compton (1997). Preservation in late 19th Century iron coffin burials. In W. Haglund and M. Sorg (eds). Forensic Taphonomy: The Postmortem Fate of Human Remains. Boca Raton, FL, CRC Press: 511-526.

      • Parker Pearson, M. (1999). The Archaeology of Death and Burial. College Station: Texas A&M University Press.

      • Pettitt, P. (2013). The Palaeolithic Origins of Human Burial. Routledge.

      • Pomeroy, E., Bennett, P., Hunt, C. O., Reynolds, T., Farr, L., Frouin, M., Holman, J., Lane, R., French, C., & Barker, G. (2020). New Neanderthal remains associated with the ‘flower burial’ at Shanidar Cave. Antiquity, 94(373), 11–26.

      • Randolph-Quinney, P.S. (2013). From the cradle to the grave: the bioarchaeology of Clonfad 3 and Ballykilmore 6. In N. Brady, P. Stevens and J. Channing (eds.). Settlement and Community in the Fir Tulach Kingdom. Dublin: National Roads Authority Press. pp A2.1-48.

      • Randolph-Quinney, P.S., Haines, S. and Kruger, A. (2018). The use of three-dimensional scanning and surface capture methods in recording forensic taphonomic traces: issues of technology, visualisation, and validation. In: W.J. M. Groen and P. M. Barone (eds). Multidisciplinary Approaches to Forensic Archaeology. Berlin: Springer International Publishing, pp. 115-130.

      • Rendu, W., Beauval, C., Crevecoeur, I., Bayle, P., Balzeau, A., Bismuth, T., Bourguignon, L., Delfour, G., Faivre, J.-P., Lacrampe-Cuyaubère, F., Tavormina, C., Todisco, D., Turq, A., & Maureille, B. (2014). Evidence supporting an intentional Neandertal burial at La Chapelle-aux-Saints. Proceedings of the National Academy of Sciences, 111(1), 81–86.

      • Sandgathe, D. M., Dibble, H. L., Goldberg, P., & McPherron, S. P. (2011). The Roc de Marsal Neandertal child: A reassessment of its status as a deliberate burial. Journal of Human Evolution, 61(3), 243–253.

      • Silver, M. (2016). Conservation Techniques in Cultural Heritage. In E. Stylianidis and F. Remondino (eds) 3D Recording, Documentation and Management of Cultural Heritage. Dunbeath: Whittles Publishing. pp 15-106.

      • Schotsmans, E.M.J., Georges-Zimmermann, P., Ueland, M. and Dent, B.B. (2022). From flesh to bone: Building bridges between taphonomy, archaeothanatology and forensic science for a better understanding of mortuary practices. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 501-541.

    1. eLife assessment

      The authors study the context of the skeletal remains of three individuals and associated sediment samples to conclude that the hominin species Homo naledi intentionally buried their dead. Demonstration of the earliest known instance of intentional funerary practice – with a relatively small-brained hominin engaging in a highly complex behavior that has otherwise been observed from Homo sapiens and Homo neanderthalensis – would be a landmark finding. However, the evidence for these claims is considered inadequate in the current version of the study. The four reviewers were in strong consensus that the methods, data, and analyses do not support the primary conclusions. Without full excavations, the study is missing crucial geoarchaeology (especially micromorphology) and taphonomic components, among other limitations, that make premature the conclusion that H. naledi intentionally buried their dead. The null hypothesis must be that these skeletons accumulated naturally and the research must then reject the null hypothesis and robustly exclude equifinality in order to justifiably draw the remarkable conclusions made in the present version of the paper.

    2. Reviewer #1 (Public Review):

      The discovery of Homo naledi fossils and the Rising Star cave system is unquestionably important for paleoanthropology. The fossils themselves hold a wealth of information about the diversity and complexity of morphological and evolutionary change on the hominin family tree. It is a truly amazing find and important site and it is important that information about this site continues to be produced so that more can be known. It is equally important that the papers produced from the site be fully reviewed for scientific rigor. I hope to assist with this to the best of my ability.

      In its current form the paper, "Evidence for the deliberate burial of the dead by Homo naledi," does not meet the standards of our field. The paper is hard to follow. It lacks key citations, contextual background information to inform the reader about the geological and depositional structure of the caves, and concise understandable descriptions of the methods and the significance of the results.

      The main point of the paper is to describe three possible burial features. The working hypothesis is that the features are intentional burials, and the authors seek to support this hypothesis throughout rather than test it. The authors do this by noting mineralogical differences in sediment and possible bowl-shaped sedimentological distinctions where fossil bones occur. As stated above, this evidence needs to be elaborated on in the text, contextualized, and edited for clarity. In addition, throughout the paper, the authors only consider two depositional scenarios for burial and body decomposition: 1) a body was intentionally buried in a pit that was dug into the cave sediments, and then buried in sediment (without detailing in the main text what sediment was used to backfill the pit); and 2) the body was left in a natural pit and decayed in the open. A major problem with only considering these two scenarios for body decomposition is that previous reports about cave geology and sedimentology show that it is a dynamic system involving erosion, sediment slumping and drainage, and contraction of clay, which is a major component of the sediment, etc. The authors are very clear that flooding is not a viable option for the movement of skeletal elements in the cave. However, they do not mention other processes such as erosion or sediment slumping, that are known to occur and could be responsible for moving sediment and fossils in each chamber of the cave. They also do not consider carnivore involvement which has been suggested by Val (2016) and Egeland et al. (2018). Such processes could naturally transport bodies, shift them around, and sediment erosion could bury them. The articulation of some skeletal elements is a major argument for intentional burial, yet within the cave substructure, articulated bones are often commingled with disarticulated elements from the same or different individuals. This same situation exists in the features included in this paper.
It does appear that some skeletal material was covered in sediment before decomposition and remains in articulation, but bodies decompose at different rates, and can decompose slowly, especially in environments that lack insects (see Simmons et al. 2010 Journal of Forensic Sciences https://doi-org.aurarialibrary.idm.oclc.org/10.1111/j.1556-4029.2009.01206.x). Wiersma et al., 2019 describe the cave system as very humid, but dry due to little standing water, mildly acidic, with an average temperature today of 18 °C and a minimum of 12 °C over the last million years. The starting null hypothesis should be that the bodies were naturally covered in sediment. Intentional burial requires extraordinary circumstances and requires multiple lines of solid evidence to support the hypothesis. In testing for natural burial processes, the rate of body decomposition should be reconstructed given the environmental parameters of the cave.

      In keeping with supporting their starting hypothesis that Homo naledi intentionally buried individuals in the cave, the authors conclude that "A parsimonious explanation for this configuration of skeletal remains is that these remains may be a palimpsest of burials that have sequentially disrupted each other. In this hypothesis, early burials were disturbed when pits were dug for subsequent burials. Other occurrences of remains outside of the Dinaledi Chamber and Hill Antechamber (Hawks et al., 2017; Brophy et al., 2021) are discussed as possible evidence of mortuary practices in SI 4.2. Instances where parts of individuals occur in remote narrow passages cannot be explained as a result of carnivore or water transport (Elliott et al., 2021; Brophy et al., 2021), making it necessary to consider that H. naledi may have placed these partial remains in these locations, possibly representing a form of funerary caching." After reviewing the evidence presented in the current manuscript, it is not clear why this is a parsimonious explanation. The authors have repeatedly described how incredibly challenging it is to get into and out of this cave system and all of its chambers. How could any species, even a small-bodied species, drag/pull/shove dead bodies through small crevices, shove or drop them down a narrow chute, continue to move through the Hill Antechamber to the Dinaledi Chamber and bury bodies? It is not impossible, but given the previously published descriptions of the dynamic process of sediment movement in the cave it is certainly not a parsimonious explanation. To support this will take many more lines of evidence than presented here, such as micromorphological analysis of the overall cave system and each feature (discussed, but only briefly, in the supplementary information), and full detailed reconstruction of sediment, water, fossil, and debris movement throughout the cave system coupled with reconstructions of body decomposition rates.
Scientifically precise computer-generated reconstructions of all of this are possible working with specialists affiliated with National Geographic. An analysis also needs to start by testing a null hypothesis, not deciding on the conclusion and setting out to "prove" it.

    3. Reviewer #2 (Public Review):

      In this study (Berger et al.), geological and fossil data from the Rising Star Cave System in South Africa are presented to provide evidence for intentional burials of Homo naledi individuals. The authors focus on describing and interpreting what they refer to as "delimited burial features." These features include two located on the floor of the Dinaledi Chamber (referred to as 'Dinaledi Features' 1 and 2) and one from the floor of the Hill Antechamber.

      'Dinaledi Feature 1' consists of a collection of 108 skeletal elements recovered from sub-unit 3b deposits. These remains are believed to primarily represent the remains of a single adult individual, along with at least one additional juvenile individual. Although additional anatomical elements associated with 'Dinaledi Feature 1' are mentioned, they are not described as they remain unexcavated. The study states that the spatial arrangement of the skeletal remains is indicative of the primary burial of a fleshed body. On the other hand, 'Dinaledi Feature 2' is not extensively discussed, and its complete extent was not thoroughly investigated.

      Regarding the Hill Antechamber feature, it was divided into three separate plaster jackets for removal from the excavation. Through micro-CT and medical CT scans of these plaster jackets, a total of 90 skeletal elements and 51 dental elements were identified. From these data, three individuals were identified, along with a fourth individual described as significantly younger. Individuals 1 and 2 are classified as juveniles.

      I feel that there is a significant amount of missing information in the study presented here, which fails to convince me that the human remains described represent primary burials, i.e. singular events where the bodies are placed in their final resting places. Insufficient evidence is provided to differentiate between natural processes and intentional funerary practices. In my opinion, the study should include a section that distinguishes between taphonomic changes and deliberate human modifications of the remains and their context, as well as reconstruct the sequence and timeline of events surrounding death and deposition. A deliberate burial involves a complex series of changes, including decomposition of soft tissues, disruption of articulations between bones, and the sequence of skeletonization. While the geological information is detailed, the archaeothanatological reasoning (see below) is largely absent and, when presented, it lacks clarity and unambiguousness.

      My main concern is that the study does not apply or cite the basic principles of archaeothanatology, which combines taphonomy, anatomy, and knowledge of human decomposition to interpret the arrangement of human bones within the Dinaledi Chamber and the Hill Antechamber. Archaeothanatology has been developed since the 1970s (see Duday et al., 1990; Boulestin and Duday, 2005; Duday and Guillon, 2006) and has been widely used by archaeologists and osteologists to reconstruct various aspects such as the original treatment of the body, associated mortuary practices, the sequence of body decomposition, and the factors influencing changes in the skeleton within the burial.

      Specifically, the study lacks a description of the relative sequence of joint disarticulation during decomposition and the spatial displacement of bones. A detailed assessment of the anatomical relationships of bones, both articulated and disarticulated, as well as the direction and extent of bone displacement, is missing. For instance, while it is mentioned that "many elements are in articulation or sequential anatomical position," a comprehensive list of these articulated elements and their classification (as labile or not) is not provided.

      Furthermore, the patterns described are not illustrated in sufficient detail. If Homo naledi was deliberately buried, it would be crucial to present illustrations depicting the individuals in their burial positions, as well as the representation and proportions of the larger and smaller anatomical elements for each individual. While Figure 2B provides an overall view of 'Dinaledi Feature 1,' it is challenging to determine the relationships of bones, whether articulated or disarticulated, in Figures 2C or 2D. Such information is essential to determine whether the bones are in a primary or secondary position, differentiate between collective and multiple burials, ascertain the body's stage of decomposition at the time of burial, identify postmortem and post-depositional manipulation of the body and grave (e.g., intentional removal of bodies/body parts), and establish whether burial occurred immediately after death or was delayed.

      Moreover, the study does not address bone displacements within secondary voids created after the decomposition of soft tissues, nor does it provide assessments of the position of bones within or outside of the original body volume. Factors such as variations in soft tissue volume between individuals of different sizes/corpulence, and the progressive filling (i.e., sediment continually fills newly formed voids) or delayed filling (causing the 'flattening' of the ribcage and 'hyper-flexed' burials, for instance) of secondary open spaces with sediment over time should also be discussed.

      In conclusion, while I acknowledge the importance of investigating potential deliberate burials in Homo naledi, I do not think that in its present form, the evidence presented in this study is as robust as it should be.

    4. Reviewer #3 (Public Review):

      This paper provides new information on the Dinaledi Chamber at the Rising Star Cave System. In short, a previously excavated area was expanded and resulted in the discovery of a cluster of bones appearing to be of one individual, a second similar cluster, and a third cluster with articulated elements (though with several individuals). Two of these clusters are argued to be intentionally buried individuals (the third one has not been investigated) and thus Homo naledi not only placed conspecifics in deep and hard to reach parts of caves but also buried them (apparently in shallow graves). This would be the oldest evidence of intentional burial. The main issue with the paper is that the purported burials were not fully excavated. Two are still in the ground, and one was removed in blocks but left unexcavated. As burials are mostly about sediments, it means the authors are lacking important lines of evidence. Instead, they bring other lines of argument as outlined below. While their preferred scenario is possible, there are important issues with the evidence as presented and they are severely hampered by the lack of detailed archaeological and geoarchaeological information both from the specific skeletal contexts and more generally from the chamber (because in fact the amount of excavation conducted here is still quite limited in scope). I also found that while the presentations of the various specialists in the team were quite good, the integration of these contributions into the main text was not. In particular, the geology of the cave system and the chamber (especially what is known of the depositional and post-depositional processes) needs to be better integrated into the presentation of the archaeology and the interpretation of the finds.

      Often the presence of articulated or mostly articulated skeletons is used to argue for intentional burial. This argument, however, is based on the premise that if not buried, these skeletons would have otherwise become disarticulated. Normally disarticulation would happen as a result of subsequent use of the site by hominins (e.g. purported burials in Neandertal cave sites) or by carnivores scavenging the body. Indeed this latter point is why bodies are buried so deeply in many Western societies (i.e. beyond the reach and smell of carnivores). Bodies can also be disarticulated by natural processes of deposition and erosion.

      However, here in the case of the Dinaledi Chamber, we apparently don't have any of these other processes. The chamber was not used by carnivores and it was not a living area where H. naledi would have frequently returned and cleared out the space. As for depositional processes, it is more complex, but it is clear from Wiersma et al. that there is a steady, constant movement of these sediments towards drains. They also think that this process can account for the mix of articulated and non-articulated elements in the cave. Importantly, that same paper makes the argument that the formation of these sediments is not the result of water movement and that the cave has been dry since the formation of this deposit. So bodies lying on the surface and slowly covered by the formation of the deposit and slowly moving towards the drains could perhaps account for the pattern observed, meaning burial is not needed to account for articulations (note that more information on fabrics would be good in this context: orientation analysis of surface finds or of excavated finds is either completely lacking or minimal. Figures 13b and 13c report orientations for 79 bones of unknown context that appear to show perhaps elevated plunge angles and some slight patterning in bearing, but there are no associated statistics or text explaining the significance).
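
      As an illustration of the kind of statistic that could accompany such orientation data, here is a minimal sketch of a Rayleigh test for non-uniformity of bearings. It is not taken from the manuscript or from any of the cited papers: the function name is hypothetical, and the p-value uses only the first-order approximation exp(-Z), which is rough for small samples.

```python
import math

def rayleigh_test(bearings_deg, axial=False):
    """Return (mean resultant length, Rayleigh Z, approximate p-value).

    Hypothetical helper for illustration only.
    """
    # Axial data (a bone's long axis has no head or tail) are doubled
    # so that 0 and 180 degrees count as the same orientation.
    factor = 2.0 if axial else 1.0
    angles = [math.radians(factor * b) for b in bearings_deg]
    n = len(angles)
    c = sum(math.cos(a) for a in angles)
    s = sum(math.sin(a) for a in angles)
    r_bar = math.hypot(c, s) / n   # 0 = no preferred direction, 1 = all aligned
    z = n * r_bar ** 2             # Rayleigh Z statistic
    p = math.exp(-z)               # first-order approximation (assumption)
    return r_bar, z, p
```

      A low p-value would suggest a preferred bearing, which is what one might expect if elements were reoriented by slow sediment creep towards a drain. Whether the bearings should be treated as axial is a choice the analyst has to justify.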

      So, unless the team can provide some process that would have otherwise disarticulated these skeletons after the bodies arrived here and decomposed, their articulated state is not evidence of burial (no more than finding an articulated or mostly articulated bear skeleton deep in a European cave would suggest that it was buried).

      As for the elemental analysis, what I understood from the paper is that the sediment associated with bones is different from the sediment not associated with bones. It is therefore unsurprising that the sediment associated with the reported skeletons clusters with sediments with bones. The linking argument for why this makes this sediment pit fill is unclear to me. Perhaps it is there, but as written I didn't follow it.

      What the elemental analysis could suggest, I think, is that there has not been substantial reworking of the sediments (as opposed to the creep suggested by Wiersma et al.) since the bones leached these minerals into the sediment. What I don't know, and what is not reported, is how long after deposition we can expect the soil chemistry to change. If this elemental analysis were extended in a systematic way across the chamber (both vertically and horizontally) after more extensive excavations, I could see it perhaps being useful for better understanding the site formation processes and depositional context. As it is now, I did not see the argument in support of a burial pit.

      The other line of evidence here is that some bones are sediment supported. The argument here is that when a body decomposes, bones that were previously held in place by soft tissues will be free to move and will shift their position. How the bones shift will differ depending on whether the body is surrounded by matrix (as they argue here in an excavated burial pit) or whether it is in the open (say, for instance, in a coffin) (and there are other possibilities as well - for instance wrapped in a shroud). Experiments have also shown the order in which the tendons, for instance, decompose and therefore which bones are likely to be free to move first or last.

      I will note that this literature is poorly cited. I think the only two papers cited for how bodies decompose are Roksandic 2002 and Mickleburgh and Wescott 2019. The former is a review paper that summarizes a great many contexts that are clearly not appropriate here, and it generally makes the point that it is difficult to sort out, and it notes that progressively filled is an additional alternative to not buried/buried. The other looks at experimental data of bodies decomposing without being buried. In the paper here, this citation is used to argue that the body must have been buried. I don't see the linking argument at all. And the cited paper is mostly about how complicated it is to figure this all out and how many variables are still unaccounted for (including the initial positioning of the body and the consumption of the body by insects - something that is attested to at Naledi - plus snails - see not just Val but also Wiersma et al. and I think the initial Dirks et al. paper).

      So the team here instead simply speaks of how the body decomposes in burials as if it is known. For the Feature 1 skeleton, the authors note that the ribs are "apparently" sediment supported and that a portion of the partial cranium is vertical or subvertical and sediment supported. For both of these, the figures show it very poorly. We really have to take their word for it. Second, I would have liked to have seen some reference and comparison to the literature for how the ribs should be in sediment burial cases. For the cranium, it seems like a broken cranium resting on a surface will have vertical aspects regardless of sediment support. To the contrary, the orientation of the cranium will change depending on whether there is sediment holding it in place or not. But that argument is not made here. It is very hard from the figures to have a detailed idea of how these skeletons are oriented in the sediments, to know which elements are in articulation, which are missing, etc.

      In the case of the Hill Antechamber Feature, an additional argument is made about the orientation of the finds in relation to the natural stratigraphy in this location. The team argues that the skeleton is lying more horizontally than the sediments and that in fact the foot is lying against the slope. First, there is no documentation of the slope of the layers here (e.g. a stratigraphic profile with the layers marked or a fabric analysis). There is a photo in the SI that says it shows sloping, but it needs some work. Second, this skeleton was removed in three blocks and then scanned. So the position of the skeleton is being worked out separate from its context. This is doable, but I would have liked to have seen some mention of how the blocks were georeferenced in the field and then subsequently in the lab and of how the items inside the block (i.e. the data coming from the CT scanner) were then georeferenced. I can think of ways I would try to do this, but without some discussion of this critical issue, the argument presented in Figure 10c is difficult to evaluate. Further, even if we accept this work, it is hard for me to see how the alignment of the foot is 15 degrees opposite the slope (the figure in the SI is better). It is also hard to understand the argument that the sediment separating the lower limb from the torso means burial. The team gives the explanation that if the body was in an open pit it would have been flat with no separation. Maybe. I mean I guess if the pit was flat. But there is no evidence here of a pit (at all). And what if the body was stuffed down the chute and was resting on a slope and covered with additional sediments from the chute (or additional bodies) as it decomposed? It seems that this should be the starting point here rather than imagining a pit.
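
      To make the geometric claim concrete, here is a minimal sketch of how the angle between a skeletal element and a sloping layer could be computed once both are georeferenced in the same frame. It is not the authors' method. The function names are hypothetical, and the sketch assumes an (east, north, up) coordinate frame, a layer described by its dip and dip direction, and an element described by a vector along its long axis.

```python
import math

def plane_normal(dip_deg, dip_direction_deg):
    """Upward unit normal of a plane, in (east, north, up) coordinates."""
    dip = math.radians(dip_deg)
    azimuth = math.radians(dip_direction_deg)  # clockwise from north
    return (math.sin(azimuth) * math.sin(dip),
            math.cos(azimuth) * math.sin(dip),
            math.cos(dip))

def angle_to_plane(vector, dip_deg, dip_direction_deg):
    """Angle in degrees between a line (e.g. a bone long axis) and a plane."""
    n = plane_normal(dip_deg, dip_direction_deg)
    dot = sum(v * k for v, k in zip(vector, n))
    length = math.sqrt(sum(v * v for v in vector))
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return math.degrees(math.asin(min(1.0, abs(dot) / length)))
```

      Under these assumptions, a horizontal element pointing along the dip direction of a 30 degree slope makes an angle of 30 degrees with the layer, while a horizontal element lying along strike makes an angle of zero. The result is only as good as the georeferencing of the blocks and of the CT data within them, which is the point at issue.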

      One of the key pieces of evidence for demonstrating deliberate burial is the recognition of a pit. Pits can be identified because of the rupture they create in the stratigraphy when older sediments are brought to the surface, mixed, and then refilled into the pit with a different color, texture, compaction, etc. In some homogenous sediments a pit can be hard to detect and in some instances post-depositional processes (e.g. burrowing) can blur the distinction between the pit and the surrounding sediments. But the starting point of any discussion of deliberate burial has to be the demonstration of a pit. And I don't see it here. It might just be that the figures need to be improved. But I am skeptical because the team has taken the view that these finds can't be excavated. While I appreciate the scanning work done on the Antechamber find, it is not the same as excavating. Same comment for Features 1 and 2.

      In short, my view is that they have an extremely interesting dataset. That H. naledi buried their dead here can't be excluded based on the data, but neither is it supported here. My view is that this paper is premature and that more excavation and the use of geoarchaeological techniques (especially micromorphology) are required to sort this out (or go a long way towards sorting it out).

    5. Reviewer #4 (Public Review):

      Berger et al. 2023a argues that Homo naledi intentionally buried their dead within the Rising Star cave system by digging pits and covering the bodies with infilled sediment. The authors identified two burials: Dinaledi Feature 1 from the Dinaledi Chamber, and the Hill Antechamber Feature from the Hill Antechamber. The evolutionary and behavioral implications for such behavior are highly significant and would be the first instance of a relatively small-brained hominin engaging in complex behavior that is often found in association with Homo sapiens and Homo neanderthalensis. Thus, the scientific rigor applied to validate these findings should be of the highest quality and should provide clear documentation of intentional burial. In an attempt to meet these standards, the authors stated a series of tests that would support their hypothesis of intentional burials in the Rising Star Cave system:

      "The key observations are (1) the difference in sediment composition within the feature compared to surrounding sediment; (2) the disruption of stratigraphy; (3) the anatomical coherence of the skeletal remains; (4) the matrix-supported position of some skeletal elements; and (5) the compatibility of non-articulated material with decomposition and subsequent collapse." (page 5)

      To find support for the first (1) test, the authors collected sediment samples from various locations within the Rising Star Cave system, including sediment from within and outside Dinaledi Feature 1. However:

      • The authors did not select sediment samples from within the Hill Antechamber Feature, so this test was only used to assess Dinaledi Feature 1.

      • The sediment samples were analyzed using x-ray diffraction (XRD) and x-ray fluorescence (XRF) to test the mineralogy and chemistry of the samples from within and outside the feature. The XRF results were presented as weighted percentages (not intensities) with no control source reported. The weighted percentages were analyzed using a principal components analysis (PCA) while the particle-size distribution was analyzed using GRADISTAT statistics package and the Folk and Ward Method to summarize "mean grain size, sorting, skewness and kurtosis in addition to the percentages of clay, silt and sand in each sample." (page 28).

      • The PCA results were reported solely as a biplot without showing the PC scores projected into the loading space, which is unusual and does not present the data accurately. Instead, the authors present the scores of a single component (PC2, figure 3) because they interpreted this component as one that "distinctly delineates fossil-bearing sediments from sterile sediments based on the positive loadings of P and S" (Page 6). However, the supplementary table that reports XRF bulk chemistry results as a weighted percentage of minerals within each sample (SI Table 1) shows mostly an absence of data for both Na and S. Since Na is at the lower end of detection limits for the method, and S seems to just be absent from the list, the intentions of the authors for showing the inclusion of these elements in their PCA results is unclear. Given that this is the authors' primary method for demonstrating a burial, this issue is particularly concerning and requires additional attention.

      • Regardless of the missing data, this reviewer attempted to replicate the XRF PCA results using the data provided in SI Table 1 and was unsuccessful. The samples that were collected from within the feature (SB) cluster with samples collected from sterile sediments and other locations around the cave system. Thus, these results are not replicable as currently reported.

      • Visual comparisons of sediment grain size, shape, and composition were qualitatively summarized. Grain size was plotted as a line graph and is buried as supplemental Figure S13 showing sample by color and area, but these results do not distinguish samples from WITHIN the burial compared to OUTSIDE the burial as the authors state in the methods as a primary goal.
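
      For readers unfamiliar with the grain-size summary named above, here is a minimal sketch of the Folk and Ward graphic measures computed from phi percentiles. It is not GRADISTAT's implementation and not the authors' workflow. The function name is hypothetical, and the input is assumed to be a mapping from cumulative percentiles to grain size in phi units.

```python
def folk_ward(phi):
    """Folk and Ward (1957) graphic measures from phi percentiles.

    `phi` maps cumulative percentiles (5, 16, 25, 50, 75, 84, 95)
    to grain size in phi units. Hypothetical helper for illustration.
    """
    p5, p16, p25, p50, p75, p84, p95 = (
        phi[k] for k in (5, 16, 25, 50, 75, 84, 95)
    )
    mean = (p16 + p50 + p84) / 3
    sorting = (p84 - p16) / 4 + (p95 - p5) / 6.6
    skewness = ((p16 + p84 - 2 * p50) / (2 * (p84 - p16))
                + (p5 + p95 - 2 * p50) / (2 * (p95 - p5)))
    kurtosis = (p95 - p5) / (2.44 * (p75 - p25))
    return {"mean": mean, "sorting": sorting,
            "skewness": skewness, "kurtosis": kurtosis}
```

      Computing these four values separately for samples taken inside and outside a feature, and reporting them side by side, would be one way to make the within-versus-outside comparison explicit.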

      To test the second (2) aim, the "stratigraphy" was primarily described in text.

      • For Dinaledi Feature 1, the authors state that the layer around Feature 1 "is continuous in the profile immediately to the east of the feature; it is disrupted in the sediment profile at the southern extent of the feature (fig. 3b)." Upon examination of figure 3b, the image shows an incredibly small depiction of the south (?) profile view with an extremely large black box overlaying a large portion of the photograph containing a small 5 cm scale. Visually, there is no difference in the profile that would suggest a disruption in the form of a pit. The LORM (orange-red mud layer) does seem to become fragmentary, but no micromorphological analysis was conducted on this section to provide an evaluation of stratigraphic composition. Also, by only excavating a portion of the feature, the authors were unable to adequately demonstrate the full extent of this feature.

      • The authors attempt to describe "a bowl-shaped concave layer of clasts and sediment-free voids make up the bottom of the feature" (page 13) and refer to figures and supplementary information that do not depict any stratigraphic profile. Moreover, the authors state that "the leg, foot, and adjacent [skeletal?] material cut across stratigraphy" indicating that the skeleton is orientated on a flat plane against the surrounding stratigraphy that is "30° slope of floor and underlying strata" (page 51, fig. 10c captions). There is no mention of infilled sediment from a pit and how this relates to the skeleton or the slope of the floor. It is therefore extremely unclear what the authors are meaning to describe without any visual or micromorphological supplementation to demonstrate a "bowl-shaped concave layer".

      The third (3) test was to evaluate the anatomical coherence of the skeletal remains using macro- and micro-CT (computed tomography) of the Hill Antechamber Feature that was removed during excavation. To visually assess the anatomy of the Dinaledi Feature 1 burial, the authors describe the spatial relationship of skeletal elements as they were being excavated but halted partway through the excavation.

      • The authors do not provide any documentation (piece-plotting, 3D rendering of stages of excavation, etc.) of the elements that were removed from the Dinaledi Feature. Figure 4 and SI Fig. S22 show the spatial relationship between identifiable skeletal elements that remain in the Feature. However, in Fig. 4, it is unclear why the authors chose to plot 2013-2014 excavated material along with material reported here, and it's even more difficult to understand the anatomical positioning of the elements given their color and point size choices. The authors do provide a 3D rendering of the unexcavated remains showing some skeletal cohesion, apart from the mandible and teeth being re-located near the pelvis (Fig. 9). That said, it is very difficult to visually confirm the elements from this model or understand the original placement of the skeleton.

      • 3D renderings of the Hill Antechamber feature skeletal material are clearly shown in SI Fig. S26. Contrary to what the authors state in text, there is a rather wide dispersal and rearrangement of elements for a "burial" that is theoretically protected from scavengers and other agents that would aid in dispersing bone from the surface. The authors do not offer any alternatives to explain disturbance, such as human activity, which clearly took place.

      • Moreover, there does not appear to be any intentional arrangement of limbs that may suggest symbolic orientation of the dead (another line of evidence often used to support intentional burial but omitted by the authors). Thus, skeletal cohesion is not enough evidence to support the hypothesis of an intentional burial.

      The fourth (4) test was attempted by evaluating whether some elements were vertically aligned from 3D reconstructed models of Hill Antechamber Feature and a photogrammetric model of the Dinaledi Feature 1. The authors state that "the spatial arrangement of the skeletal remains is consistent with primary burial of the fleshed body" (page 8 in reference to Dinaledi Feature 1) without providing any evidence, qualitative or quantitative, that this is the case for either burial.

      Since this reviewer was unable to understand the fifth (5) test as it was written by the authors, I am unable to comment on the evidence to support this test and will default to the other reviewers for evaluation of this claim.

      In addition to a lack of evidence to support the claims of intentional burial, this paper was also written extremely poorly. For example, the authors often overused 'persuasive communication devices' (see eLife article, https://elifesciences.org/articles/88654) to mislead readers:

      "During this excavation, we recognized that the developing evidence was suggestive of a burial, due to the spatial configuration of the feature and the evidence that the excavated material seemed to come from a single body." (page 5)

      As an opening statement to introduce Dinaledi Feature 1, the authors state the interpretation and working hypothesis as fact before the authors present any evidence. This is known as "HARKing" and "gives the impression that a hypothesis was formulated before data were collected" (Corneille et al. 2023). This type of writing is pervasive throughout the manuscript and requires extensive editing. I recommend that the authors review the article provided by eLife (https://elifesciences.org/articles/88654) and carefully review the manuscript. Moreover, as this text demonstrates, the authors’ word choice is indicative of storytelling for a popular news article instead of a scientific paper. I highly suggest that the authors review the manuscript carefully and present the data prior to giving conclusions in a clear and concise manner.

      Moreover, the writing structure is inconsistent. Information that should be included in results is included in the methods, text in the results should be in discussions, and so forth. This inconsistency is pervasive throughout the entire manuscript, making it incredibly difficult to adequately understand what the authors had done and how the results were interpreted.

      Finally, the "artifact" that was described and visualized using CT models is just that - a digitally colored model. The object in question has not been analyzed. Until this object is removed from the dirt and physically analyzed, this information needs to be removed from the manuscript as there is nothing to report before the object is physically examined.

      Overall, there is not enough evidence to support the claim that Homo naledi intentionally buried their dead inside the Rising Star Cave system. Unfortunately, the manuscript in its current condition is deemed incomplete and inadequate, and should not be viewed as finalized scholarship.

    6. Author Response:

      We would like to thank the eLife reviewers for the considerable time and effort they have invested to review these manuscripts. We have also benefited from a previous round of review of the manuscript describing the proposed burial features, which underwent two rounds of revisions in a high-impact journal over a period of approximately 8 months during 2022 and early 2023. Both sets of reviews have reflected mixed responses to the evidence we have presented, with one reviewer recommending acceptance with minor editorial revisions, two recommending acceptance with minor revisions and the fourth recommending rejection based upon similar arguments to those reflected by some of the reviewers in this current round of reviews in eLife. Ultimately the managing editor of this first journal took the decision that the review process could not be completed in a timely manner and rejected the manuscript, although the submission here reflected our consideration of these reviewers' suggestions.

      We have chosen in this initial response to the eLife reviews to include some references to the previous anonymous reviews in order to illustrate differences of opinion and differences in revision suggestions within the review process. Our goal is to offer maximal insight into our decision-making process and to acknowledge the considerable time and effort put into the assessment of these manuscripts by reviewers (for eLife and in the case of the earlier review process). We hope that this approach will assist the readers, and reviewers, of our manuscripts in understanding why we are proceeding with certain decisions during the revision process.

      This is a new process for us and the reviewers, and one way in which it significantly differs from more traditional review is that both the reviews and our reply will be public well in advance of our revisions to the manuscript. Indeed, considering the scope of the reviews, some of those revisions may take considerable time, although many can be accomplished fairly easily. Thus, we are not in a position to say that we have solved every issue raised by the reviewers. Instead, we will examine what appear to be the key critical issues raised regarding the data and the analyses and how we propose to address these as we revise the papers. We will also address several philosophical and ethical issues raised by the reviews and our proposal for dealing with these. More specific editorial and citational recommendations will be dealt with on a case-by-case basis, and we do not address these point-by-point in this reply. Please note, this response to the reviewers is not the revision of the manuscript and is only the initial opinion of the corresponding authors with some guidance from the larger group of authors of all three papers. Our final submitted revision will reflect the input of all authors included on those submissions.

      We took the decision to submit three separate papers consciously. The two different categories of evidence, burials and engravings, involve different kinds of analysis and different (although overlapping) teams of researchers, and we recognized that each deserved its own presentation and assessment. Meanwhile, together they inform the context of H. naledi in a way that requires some synthetic discussion, in which both kinds of evidence are relevant, leading to a third paper. But the mutual relevance of these different kinds of evidence and their review by a common set of reviewers naturally raises cross-cutting issues, and the reviewers have cross-referenced the three articles. This has sometimes led to suggestions about one manuscript based on the contents of another. Considering the situation, we accepted the recommendation that it would be clearer to consider all three articles in a single reply. Thus, while each of the three papers will proceed separately during the revision process, it will occasionally be necessary to refer across all three papers in our responses.

      Scientific Issues:

      In reading the reviews, we feel there are 9 critical points/assertions raised by one or more of the reviewers that present a problem for, or challenge to, our hypothesis that the observed evidence (bone accumulations and engravings) described in the Dinaledi subsystem are of intentional naledigenic origin. These are:

      1. The evidence presented does not demonstrate a clear interruption of the floor sediments, thus failing to demonstrate excavated holes.

      2. The sediments infilling the holes where the skeletal remains are found have not been demonstrated to originate from the disruption of the floor sediments and thus could be part of a natural geological process (e.g. water movement, slumping) or carnivore accumulations.

      3. Previous geological interpretations by our research group have given alternative geological explanations for formation of the bony accumulations that contradict the evidence presented here and result in alternative origins hypotheses.

      4. Burial cannot be effectively assessed without complete excavation of the features and site.

      5. The skeletal remains as presented do not conform clearly to typical body arrangement/positions associated with human (Homo sapiens) burials.

      6. There is no evidence of grave goods or lithic scatters that are typically associated with human burials.

      7. Humans may have been involved with the creation of either the Homo naledi bone accumulations, the engravings, or both.

      8. Without a date of the engravings, the null hypothesis should be the engravings were created by Homo sapiens.

      9. The null hypothesis for explanation of the skeletal remains in this situation should be “natural accumulation”.

      Our analysis of the Dinaledi Feature 1 leads us to accept that the laminated orange-red mudstone (LORM) sedimentary layer is interrupted, indicating a non-natural intervention, and that the hole created by the interruption was then filled by a fleshed body (and perhaps parts of other bodies), which was then covered by sediment that originated from the hole that was dug. We recognize that the four eLife reviewers are not convinced that our presentation is sufficient to establish this. Interestingly, this was not the universal opinion of earlier reviewers of the initial manuscript, several of whom felt we had adequately supported this hypothesis. The lack of clarity in this current version of the burial manuscript is our responsibility. In the upcoming revision of this paper to be submitted, we will take the reviewers’ critiques to heart and add additional figures that illustrate better the disruption of the LORM and clarify the sedimentological data showing that the material covering the skeletal remains in the hole is the disrupted sediment excavated from the same hole. We are proposing to isolate this most critical evidence for burial into a separate section in the revised submission based on the reviewers’ comments. The fact that the LORM layer is disrupted, a fleshed body was placed in the hole created by this disruption, and the body (and perhaps parts of other bodies) was/were then covered by the same sediments from the hole is the central feature of our hypothesis that the bone accumulations observed reflect a burial and not a natural process.

      The possibility of fluvial transport or involvement in the subsystem is a topic that we have addressed extensively in past work, and it is clear from these reviews that we must enhance our current manuscript to discuss this issue at greater length. Our previous work (Dirks et al. 2015; Dirks et al. 2017) emphasized that fluvial transport of whole bodies into the subsystem was precluded by several lines of sedimentological evidence. We excavated a rich accumulation of skeletal remains, including articulated limbs and other elements in subvertical orientations inconsistent with slow sedimentary infill, which were difficult to explain without positing a large and dense pile of bodies and/or sediment movement. We encountered fractured chunks of laminated orange-red mudstone (LORM) in random orientations within our excavation area, within and among skeletal remains, which directly refuted the idea that the remains were inundated with water at the time of burial, and this limited the possibility of fluvial transport. Water flow sufficient to displace bodies or complete skeletal evidence would also transport large and coarse sediment, which is absent from the subsystem, and would sort the commingled skeletal material that we found by size, which we do not observe. But our excavation only covered less than a square meter at very limited depth, and this was the limit to our knowledge of subsurface sediment. We thus were left with uncertainty that led us to suggest the possibility of sediment slumping or movement into subsurface drains, although these were not observed near our excavation. Our current work expands our knowledge of the subsurface and presents an alternative explanation for the disposition of skeletal remains from our earlier excavation. But we acknowledge that this new explanation is vulnerable to our own previous published proposals, and we must do a better job of explaining how the new information addresses our previous suggestions. By not clearly creating a section where we explained how these previous hypotheses were now nullified by new evidence, we clearly confused the reviewers with our own previous work. We will revise the manuscript by enhancing the review of the significant geological evidence demonstrating that there is no significant fluvial action in the system and making it clear how the burial hypothesis provides a clearer explanation for the situation of skeletal remains from our previous excavation work.

      One of the central issues raised by reviewers has been a perceived need to excavate these features completely, totally exhuming all skeletal remains from them. Reviewers have written that it is necessary to identify every skeletal element that is present and account for any missing elements. On this point, we have both ethical and scientific differences from these reviewers. We express our ethical concerns first. Many of the best-preserved possible burials ever discovered by archaeologists were subjected to total excavation and exhumation. Cases like La Chapelle-aux-Saints, La Ferrassie, and Skhūl were fully excavated at a time when data recording and excavation methods did not include the range of spatial and geomorphological approaches that later became routine. The judgment of early investigators that these situations were intentional burials was challenged by later workers, and the kind of information that might enable better tests had been irrevocably lost (Gargett 1999; Dibble et al. 2015; Rendu et al. 2014).

      Later, improved excavation standards have not sufficed to remove uncertainty or debate about possible burials. For example, it was long presumed that well-preserved remains of young children were by themselves diagnostic of intentional burial, such as those from Dederiyeh, Border Cave, or Roc de Marsal. Such cases were also fully excavated, with adequate documentation of the positioning of skeletal remains and their surrounding stratigraphic situation, but such cases were later challenged on several bases and the complete exhumation of material has confused or precluded testing of new hypotheses (e.g. Gargett 1999). The case of Roc de Marsal is one in which data from the initial excavation combined with re-excavation and geoarchaeological analysis led to a naturalistic interpretation of the skeletal material (Sandgathe et al. 2011; Goldberg et al. 2017). But even in this case, the researchers erred in their interpretation of the skeleton’s situation due to a lack of identification of parts of the infant’s skeleton (Gómez-Olivencia and García-Martinez 2019). That is to say, it is not only the burial hypothesis but other hypotheses that suffer from complete excavation. Researchers concerned with preserving all possible information have sometimes taken extraordinary measures to remove and study possible burials at high-resolution in the laboratory. Such was the case of the Shanidar IV burial removed from the site and transported in plaster jacket by Solecki, which led to the disruption and loss of internal stratigraphic information (Pomeroy et al. 2020). Arguably, the current state of the art is full excavation with partial preparation, such as that undertaken at Panga ya Saidi (Martinón-Torres et al. 2021). But again, any future attempt to reinterpret or test the hypothesis of burial must rely on the adequacy of documentation as the original context has been removed.

      In our decision to leave material in place as much as possible, we are expanding upon standard practice to leave witness sections and unexcavated areas for future research. The situation is novel, representing possible burials by a nonhuman species, and that makes it doubly important in our opinion to be conservative in not fully exhuming the skeletal material from its context. We anticipate that many other researchers, including future investigators, will suggest additional methods to further test the hypothesis of burial, something that would be impossible if we had excavated the features in their entirety prior to publishing a description of our work. We believe strongly that our ethical responsibility is to publish the work and the most likely interpretation while leaving as much evidence in place as possible to enable further testing and replication. We welcome the suggestions of additional methods/analyses to test the H. naledi burial hypothesis.

      This being said, we also observe that total exhumation would not resolve the concerns raised by the reviewers. The recommendation of total exhumation is in pursuit of a full account of all skeletal material present and its preservation and spatial situation, in order to demonstrate that they conform to body positions comparable to human burials. As has been highlighted in forensic casework, the excavation of an inhumation feature does not necessarily provide an accurate spatial or anatomical manifest of the stratigraphical relationships between the body, encapsulating matrix, and any cut present due to preservational, taphonomic and operational factors (Dirkmaat and Cabo, 2016; Hunter, 2014). In particular, in cases where skeletal elements are highly fragmented, friable, or degraded (such as through bioerosion), complete excavation, even under controlled laboratory conditions, may destroy bone and severely limit skeletal identification (Henderson, 1987; Hochrein, 2002; Owsley and Compton, 1997), particularly in elements where the ratio of trabecular to cortical bone is high (Darwent and Lyman, 2002; Lyman, 1994). As such, non-invasive methods of 3D and 4D modelling (preservation in situ) are often considered preferable to complete necropsy or excavation (preservation by record) where appropriate (Bolliger and Thali, 2009; Dell’Unto and Landeschi, 2022; Randolph-Quinney et al., 2018; Silver, 2016).

      The test of burial is not primarily positional, but taphonomic and geological. The position and number of bones can elaborate on process-driven questions of decay and destruction in the burial environment, or post-mortem modification, but are not singularly indicative of whether the remains were intentionally buried; the post-mortem narrative of all the processes affecting the cadaveric island is required (Knüsel and Robb, 2016). In previous cases, researchers have disputed or accepted the hypothesis of intentional hominin burial based upon assumptions about how modern humans or Neandertals would have positioned bodies, with the idea that some positions reflect ritual intent while others do not. But applying such assumptions is unjustifiable, particularly for a species like H. naledi, whose culture may have differed fundamentally from our own. Our work acknowledges that the present evidence does not enable a full reconstruction of the burial positions, but it does show that fleshed remains were encased in sediment prior to decomposition of soft tissue, and that subsequent spatial changes can be most parsimoniously explained by natural decomposition within sedimentary matrix contained within a burial feature (after Green, 2022; Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022). If the argument is that extraordinary claims require extraordinary evidence, we feel that the evidence documents excavation and interment (and will do so more clearly in the revision), and the fact that the remains do not match a “typical” human burial in body positioning is not in itself evidence that these are not H. naledi burials.

      We feel that the reviewers (in keeping with many palaeoanthropologists) have a clear idea of what they “think” a burial should look like in an idealised sense, but this platonic ideal of burial form is not matched by the extensive literature in archaeothanatology, funerary archaeology and forensic science, which indicates enormous variability in the activity, morphology and post-mortem system experienced by the human body in cases of interment and body disposal (e.g. Aspöck, 2008; Boulestin and Duday, 2005 and 2006; Connolly et al., 2005; Channing and Randolph-Quinney, 2006; Cherryson, 2008; Donnelly et al., 1999; Finley, 2000; Hunter, 2014; Parker Pearson, 1999; Randolph-Quinney, 2013). Decades of experience in the identification, recovery and interpretation of clandestine, deviant, and non-formal burials indicate that the platonic ideal is rare and, in many contexts, the exception (Cherryson, 2008; Parker Pearson, 1999). This variability is particularly relevant to morphological traits in burial context, such as the informal nature of the grave cut in plan and section, shallow burial depth, and initial disposition of the body (placement) during the early post-mortem period. These might run counter to the expectations of reviewers or others referencing the fossil hominin record, but are well accepted within the communities of researchers investigating Holocene archaeological sites and forensic contexts.

      It is encouraging to see reviewers beginning to incorporate the extensive (often experimentally derived) literature from archaeothanatology and forensic taphonomy in their deliberations, and we will be taking these comments on board going forward. In particular, we acknowledge reviewers’ comments and the need to construct a more detailed post-mortem narrative, accounting for joint disarticulation (labile versus persistent joints, etc.), displacement, and final disposition of elements within the burial space. As such we will incorporate the hierarchy of decomposition (rank order disarticulation), associations between regions of anatomical association, areas of disassociation, and the voids produced during decomposition (after Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022) into our narrative. In doing so we acknowledge the tensions between the inductive archaeothanatological narrative-driven approach (e.g. Duday, 2005 & 2009) and the robust decomposition data derived from human forensic taphonomic experimentation recently articulated by Schotsmans and colleagues (2022), noting that we will highlight comparative data based on forensic experimental casework and actualistic modelling over inductive intuitive approaches, which come with significant evidential shortcomings (Bristow et al. 2011).

      Finally, from a taphonomic perspective it is worth pointing out to reviewers that we have already addressed the issue of lack of taphonomic evidence for carnivore involvement in the formation of the Dinaledi assemblage (Dirks et al., 2016). Absence of any carnivore-induced bone surface modifications, patterns of skeletal part representation, and a total absence of any carnivore remains found within the Dinaledi Chamber (following Kuhn and colleagues, 2010) lead us to reject carnivores as possible vectors of body accumulation within the Dinaledi Chamber and Hill Antechamber.

      Reviewers suggest that without a date derived from geochronological methods, the engravings cannot be associated with H. naledi, and that it is possible (or probable) that the engravings were done in the recent past by H. sapiens. This suggestion neglects the context of the site. We have previously documented the structure and extremely limited accessibility of the Dinaledi subsystem. This subsystem was not recorded on maps of the documented Rising Star Cave system prior to our work and its discovery by our teams. Furthermore, there is no evidence of prehistoric human activity in the areas of the cave related to possible subterranean entrances. There is no evidence that humans in the past typically ventured into extreme spaces like those of Rising Star. It is clear from the presence of the remains of many individuals that H. naledi ventured into these spaces again and again. It is likely that H. naledi moved through these spaces more easily than humans do, based on their physique. We show that the engravings overlie each other, suggesting multiple engraving events. These engravings took time and effort, and the only evidence for use of the Dinaledi subsystem by any hominin is by H. naledi. The context leads to the null hypothesis that H. naledi made the marks. In our revision, we will elaborate on this argument to clarify the evidence for our stance on this hypothesis. Several reviewers took issue with the title of the engraving paper as we did not insert a qualifier in front of the suggested date range for the engravings. We deliberately left out qualifying language so that the title took the form of a testable hypothesis rather than a weak assertion. Should future work find the engravings were not produced within this time range, then we will restate this hypothesis.

      Finally, with regard to the engravings, we have chosen to report them because they exist. Not reporting the presence of engraved marks on the walls of a cave above hypothesized burials would be tantamount to leaving relevant evidence out of the description of an archaeological context. We recognize and state in our manuscript that these markings require substantial further study, including attempts at geochronological dating. But the current evidence is clearly relevant to the archaeological context of the subsystem. We take a similar stance with reporting the presence of the tool-shaped artefact near the hand of the H. naledi skeleton in the Hill Antechamber. It is evident that this object requires further study, as we stated in our manuscript, but again omitting it from our study would be leaving out relevant evidence.

      Some have suggested that the null hypothesis should be that all of these observed circumstances are of natural origin. Our team took this approach in our early investigation of the Dinaledi subsystem (Dirks et al. 2015). We adopted the null hypothesis that the geological processes involved in the accumulation of H. naledi skeletal remains were “natural” (e.g., non-naledigenic involvement), and we were able to reject many alternative explanations for the assemblage, including carnivore accumulation, “death trap” accumulation, and fluvial transport of bodies or bones (Dirks et al. 2015). This led us to the hypothesis that H. naledi were involved in bringing the bodies into the spaces where they were found. But we did not hypothesize their involvement in the formation of the deposit itself beyond bringing the bodies to the location.

      This approach seems conservative. It followed the traditional view that small-brained hominins do not engage in cultural practices. But we recognize in hindsight that this null hypothesis approach did harm to our analyses. It impeded us from recognizing, within our initial excavations of the puzzle box area and other excavations between 2014 and 2017, that we might be encountering remains that were intrusive in the sedimentary floor of the chamber. If we had approached the accumulation of a large number of hominins from the perspective of the null hypothesis being that the situation was likely cultural, we perhaps would have collected evidence in a slightly different manner. We certainly note that if the Dinaledi system had been full of the remains of modern humans, there would have been little doubt that the null hypothesis would have been that this was a cultural space and not a “natural space”. We therefore respectfully disagree with the reviewers who continue to support the idea that we should approach hominin excavations with the null hypothesis that they will be natural (specifically non-cultural) in origin. If excavations continue with this mindset, we believe that potential cultural evidence is almost certain to be lost.

      There has been a gradient across paleoanthropological excavations, archaeological work, and forensic investigation, with increasing precision of context. The reality is that the recording precision and frame of approach are typically different in most paleontological excavations than in those related to contemporary human remains. If anything comes from the present discussion of whether the Dinaledi system is a burial site for H. naledi or not, we hope that by taking seriously the possibility of deep cultural dynamics of hominins, we will encourage other teams to meet the highest standards of excavation in order to preserve potential cultural evidence. Given H. naledi’s cranial capacity, we suggest that even very early hominin skeletal assemblages should be re-examined, if there is sufficient evidence or records available. These would include examples such as the A.L. 333 Au. afarensis site (the so-called First Family site in Hadar, Ethiopia), the Dikika infant skeleton, WT 15000 (Turkana Boy) and even A.L. 288 (Lucy), as such unusual taphonomic situations where skeletons are preserved cannot be simply explained away as “natural” in origin, based solely on the cranial capacity and assumed lack of cognitive and cultural complexity of the hominins, as emphasized by us in Fuentes et al. (2023). We are not the first to observe that some very early hominin situations may represent early mortuary activity (Pettitt 2013), but we would advocate a step further. We suggest it may be damaging to take “natural accumulation” as the standard null hypothesis for hominin paleoanthropology, and that it is more conservative in practice to engage remains with the null hypothesis of possible cultural formation.

      We are deeply grateful for the time and effort all of the 8 reviewers (across three reviews) have taken with this work. We also acknowledge the anonymous reviewers from previous submissions, whose opinions and comments will have made the final iterations of these manuscripts better for their efforts. As this process is rather public and includes commentary outside of the eLife forum, we ask that the efforts of all 37 authors and 8 reviewers involved be respected and that the discourse remain professional in all venues as we study this fascinating and quite complex occurrence. We appreciate also the efforts of members of the public who have engaged with this relatively new process where preprints are posted prior to the reviews, allowing comments and interactions from colleagues and the public who are normally not part of the internal peer review process. We believe these interactions will make for better final papers. We feel we have met the standards of demonstrating burials in H. naledi and that the engravings are most likely associated with H. naledi. However, given the reviews, we see many areas where our clarity, context, and analyses were less strong than they can be. With the clarifications and additions taken on board through these review processes, the final papers will be stronger and clearer. We recognize that this is an ongoing process of scientific investigation and that further work will allow continued, and possibly better, evaluation of these hypotheses and others.

      Lee R Berger, Agustín Fuentes, John Hawks, Tebogo Makhubela

      Works cited:

      • Aspöck, E. (2008). What Actually is a ‘Deviant Burial’?: Comparing German-Language and Anglophone Research on ‘Deviant Burials.’ In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books.  pp 17–34.

      • Bolliger, S.A. & Thali, M.J. (2009). Thanatology. In S.A. Bolliger and M.J. Thali (eds) Virtopsy Approach:  3D Optical and Radiological Scanning and Reconstruction in Forensic Medicine. Boca Raton: CRC Press. pp 187-218.

      • Boulestin, B. & Duday, H. (2005). Ethnologie et archéologie de la mort: de l’illusion des références à l’emploi d’un vocabulaire. In: C. Mordant and G. Depierre (eds) Les Pratiques Funéraires à l’Âge du Bronze en France. Actes de la table ronde de Sens-en-Bourgogne. Paris: Éditions du Comité des Travaux Historiques et Scientifiques. pp. 17–30.

      • Boulestin, B. & Duday, H. (2006). Ethnology and archaeology of death: from the illusion of references to the use of a terminology. Archaeologia Polona 44: 149–169.

      • Bristow, J., Simms, Z. & Randolph-Quinney, P.S. (2011). Taphonomy. In S. Black and E. Ferguson (eds.) Forensic Anthropology 2000-2010. Boca Raton, FL: CRC Press. pp 279-318.

      • Channing, J. & Randolph-Quinney, P.S. (2006). Death, decay and reconstruction: the archaeology of Ballykilmore Cemetery, County Westmeath. In J. O’Sullivan and M. Stanley (eds.) Settlement, Industry and Ritual: Archaeology. National Roads Authority Monograph Series No. 3. Dublin: NRA/Four Courts Press. pp 113-126.

      • Cherryson, A. K. (2008). Normal, Deviant and Atypical: Burial Variation in Late Saxon Wessex, c. AD 700–1100. In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books. pp 115–130.

      • Connolly, M., F. Coyne & L. G. Lynch (2005). Underworld: Death and Burial in Cloghermore Cave, Co. Kerry. Bray, Co. Wicklow: Wordwell.

      • Darwent, C. M. & R. L. Lyman (2002). Detecting the postburial fragmentation of carpals, tarsals and phalanges. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press. pp 355-378.

      • d’Errico, F., & Backwell, L. (2016). Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. Journal of Human Evolution, 93, 91–108.

      • De Villiers. H. (1973). Human skeletal remains from Border Cave, Ingwavuma District, KwaZulu, South Africa. Annals of the Transvaal Museum, 28(13), 229–246.

      • Dell’Unto, N. and Landeschi, G. (2022). Archaeological 3D GIS. London: Routledge.

      • Dibble, H. L., Aldeias, V., Goldberg, P., McPherron, S. P., Sandgathe, D., & Steele, T. E. (2015). A critical look at evidence from La Chapelle-aux-Saints supporting an intentional Neandertal burial. Journal of Archaeological Science, 53, 649–657.

      • Dirkmaat, D. C., & Cabo, L. L. (2016). Forensic archaeology and forensic taphonomy: basic considerations on how to properly process and interpret the outdoor forensic scene. Academic Forensic Pathology 6, 439–454.

      • Dirks, P. H., Berger, L. R., Roberts, E. M., Kramers, J. D., Hawks, J., Randolph-Quinney, P. S., Elliott, M., Musiba, C. M., Churchill, S. E., de Ruiter, D. J., Schmid, P., Backwell, L. R., Belyanin, G. A., Boshoff, P., Hunter, K. L., Feuerriegel, E. M., Gurtov, A., Harrison, J. du G., Hunter, R., … Tucker, S. (2015). Geological and taphonomic context for the new hominin species Homo naledi from the Dinaledi Chamber, South Africa. ELife, 4, e09561.

      • Dirks, P.H.G.M., Berger, L.R., Hawks, J., Randolph-Quinney, P.S., Backwell, L.R., and Roberts, E.M. (2016). Comment on “Deliberate body disposal by hominins in the Dinaledi Chamber, Cradle of Humankind, South Africa?” [J. Hum. Evol. 96 (2016) 145-148]. Journal of Human Evolution 96:  149-153.

      • Dirks, P. H., Roberts, E. M., Hilbert-Wolf, H., Kramers, J. D., Hawks, J., Dosseto, A., Duval, M., Elliott, M., Evans, M., Grün, R., Hellstrom, J., Herries, A. I., Joannes-Boyau, R., Makhubela, T. V., Placzek, C. J., Robbins, J., Spandler, C., Wiersma, J., Woodhead, J., & Berger, L. R. (2017). The age of Homo naledi and associated sediments in the Rising Star Cave, South Africa. ELife, 6, e24231.

      • Donnelly, S., C. Donnelly & E. Murphy (1999). The forgotten dead: The cíllíní and disused burial grounds of Ballintoy, County Antrim. Ulster Journal of Archaeology 58, 109-113.

      • Duday, H. (2005). L’archéothanatologie ou l’archéologie de la mort. In: O. Dutour, J.-J. Hublin and B. Vandermeersch (eds) Objets et Méthodes en Paléoanthropologie. Paris: Comité des Travaux Historiques et Scientifiques. pp. 153–215.

      • Duday, H. (2009). Archaeology of the Dead: Lectures in Archaeothanatology. Oxford: Oxbow Books.

      • Finley, N. (2000). Outside of life: Traditions of infant burial in Ireland from cillin to cist.  World Archaeology 31, 407-422.

      • Gargett, R. H. (1999). Middle Palaeolithic burial is not a dead issue: The view from Qafzeh, Saint-Césaire, Kebara, Amud, and Dederiyeh. Journal of Human Evolution, 37(1), 27–90.

      • Goldberg, P., Aldeias, V., Dibble, H., McPherron, S., Sandgathe, D., & Turq, A. (2017). Testing the Roc de Marsal Neandertal “Burial” with Geoarchaeology. Archaeological and Anthropological Sciences, 9(6), 1005–1015.

      • Gómez-Olivencia, A., & García-Martínez, D. (2019). New postcranial remains from the Roc de Marsal Neandertal child. PALEO. Revue d’archéologie Préhistorique, 30–1, 30–1.

      • Green, E.C. (2022). An archaeothanatological approach to the identification of late Anglo-Saxon burials in wooden containers. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 436-455.

      • Henderson, J. (1987). Factors determining the state of preservation of human remains. In A. Boddington, A. Garland and R. Janaway (eds). Death, Decay and Reconstruction: Approaches to Archaeology and Forensic Science. Manchester: Manchester University Press. pp 43-54.

      • Hunter, J. R. (2014). Human remains recovery: archaeological and forensic perspectives. In C. Smith (ed). Encyclopedia of Global Archaeology. New York: Springer New York. pp 3549-3556.

      • Hochrein, M. (2002). An Autopsy of the Grave: Recognizing, Collecting and Preserving Forensic Geotaphonomic Evidence. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press: 45-70.

      • Knüsel, C.K. & Robb, J. (2016). Funerary taphonomy: An overview of goals and methods. Journal of Archaeological Science: Reports 10, 655-673.

      • Kuhn, B.F., Berger, L.R. & Skinner, J.D. (2010). Examining criteria for identifying and differentiating fossil faunal assemblages accumulated by hyenas and hominins using extant hyenid accumulations. International Journal of Osteoarchaeology 20, 15-35.

      • Lyman, R. (1994). Vertebrate Taphonomy. Cambridge, Cambridge University Press.

      • Martinón-Torres, M., d’Errico, F., Santos, E., Álvaro Gallo, A., Amano, N., Archer, W., Armitage, S. J., Arsuaga, J. L., Bermúdez de Castro, J. M., Blinkhorn, J., Crowther, A., Douka, K., Dubernet, S., Faulkner, P., Fernández-Colón, P., Kourampas, N., González García, J., Larreina, D., Le Bourdonnec, F.-X., … Petraglia, M. D. (2021). Earliest known human burial in Africa. Nature, 593(7857), 95–100.

      • Mickleburgh, H.L & Wescott, D.J. (2018). Controlled experimental observations on joint disarticulation and bone displacement of a human body in an open pit: implications for funerary archaeology. Journal of Archaeological Science: Reports 20: 158-167.

      • Mickleburgh, H.L., Wescott, D.J., Gluschitz, S. & Klinkenberg, V.M. (2022). Exploring the use of actualistic forensic taphonomy in the study of (forensic) archaeological human burials: An actualistic experimental research programme at the Forensic Anthropology Center at Texas State University (FACTS), San Marcos, Texas. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 542-562.

      • Owsley, D. & B. Compton (1997). Preservation in late 19th Century iron coffin burials. In W. Haglund and M. Sorg (eds). Forensic Taphonomy: The Postmortem Fate of Human Remains. Boca Raton, FL, CRC Press: 511-526.

      • Parker Pearson, M. (1999). The Archaeology of Death and Burial. College Station: Texas A&M University Press.

      • Pettitt, P. (2013). The Palaeolithic Origins of Human Burial. Routledge.

      • Pomeroy, E., Bennett, P., Hunt, C. O., Reynolds, T., Farr, L., Frouin, M., Holman, J., Lane, R., French, C., & Barker, G. (2020). New Neanderthal remains associated with the ‘flower burial’ at Shanidar Cave. Antiquity, 94(373), 11–26.

      • Randolph-Quinney, P.S. (2013). From the cradle to the grave: the bioarchaeology of Clonfad 3 and Ballykilmore 6. In N. Brady, P. Stevens and J. Channing (eds.). Settlement and Community in the Fir Tulach Kingdom. Dublin: National Roads Authority Press. pp A2.1-48.

      • Randolph-Quinney, P.S., Haines, S. and Kruger, A. (2018). The use of three-dimensional scanning and surface capture methods in recording forensic taphonomic traces: issues of technology, visualisation, and validation. In: W.J. M. Groen and P. M. Barone (eds). Multidisciplinary Approaches to Forensic Archaeology. Berlin: Springer International Publishing, pp. 115-130.

      • Rendu, W., Beauval, C., Crevecoeur, I., Bayle, P., Balzeau, A., Bismuth, T., Bourguignon, L., Delfour, G., Faivre, J.-P., Lacrampe-Cuyaubère, F., Tavormina, C., Todisco, D., Turq, A., & Maureille, B. (2014). Evidence supporting an intentional Neandertal burial at La Chapelle-aux-Saints. Proceedings of the National Academy of Sciences, 111(1), 81–86.

      • Sandgathe, D. M., Dibble, H. L., Goldberg, P., & McPherron, S. P. (2011). The Roc de Marsal Neandertal child: A reassessment of its status as a deliberate burial. Journal of Human Evolution, 61(3), 243–253.

      • Silver, M. (2016). Conservation Techniques in Cultural Heritage. In E. Stylianidis and F. Remondino (eds) 3D Recording, Documentation and Management of Cultural Heritage. Dunbeath: Whittles Publishing. pp 15-106.

      • Schotsmans, E.M.J., Georges-Zimmermann, P., Ueland, M. and Dent, B.B. (2022). From flesh to bone: Building bridges between taphonomy, archaeothanatology and forensic science for a better understanding of mortuary practices. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 501-541.

    1. Author Response:

      We thank eLife and the reviewer for the nice summary of our manuscript. We largely agree with the summary and review, and just add a few small points.

      First, the review asks about the reproducibility of our findings, and suggests that they are only from a single experiment. In fact, our manuscript reports data from two independent single-cell experiments: one performed at low multiplicity of infection (MOI), and another at higher MOI. The broad trends, including the lack of strong correlations between viral mRNA transcription and progeny production, are consistent across both experiments.

      Second, the reviewer asks what happens when two different virions bearing the same viral barcode infect two different cells, given that we estimate 4-8% of barcodes to be shared between multiple infecting virions. When two cells are infected by different virions with the same barcode, this breaks the one-to-one link between transcription in that cell and progeny in the supernatant, since it is not possible to determine which cell contributed the progeny with that barcode. This means that 4-8% of the points on our correlation plots could be affected by this factor, so a few outliers should be expected. Another scenario, where a single cell is infected by two barcodes, is not problematic for our method because we can simply sum the progeny output for both barcodes from that cell.
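
      The two scenarios described above can be illustrated with a minimal sketch. It is not the pipeline used in the manuscript: the function name, the input dictionaries, and the decision to set aside cells carrying a shared barcode are all assumptions made for illustration. It also only flags a shared barcode when both infected cells appear in the single-cell data, which need not hold in a real experiment.

```python
from collections import defaultdict


def link_progeny_to_cells(cell_barcodes, progeny_counts):
    """Attribute supernatant progeny counts to infected cells.

    cell_barcodes: dict mapping a cell ID to the set of viral barcodes
        detected in that cell (hypothetical input format).
    progeny_counts: dict mapping a viral barcode to its progeny count
        in the supernatant (hypothetical input format).

    Returns (per_cell_progeny, ambiguous_barcodes).
    """
    # Record which cells each barcode was observed in.
    cells_per_barcode = defaultdict(set)
    for cell, barcodes in cell_barcodes.items():
        for barcode in barcodes:
            cells_per_barcode[barcode].add(cell)

    # A barcode seen in more than one cell breaks the one-to-one link
    # between a cell and its progeny.
    ambiguous = {bc for bc, cells in cells_per_barcode.items() if len(cells) > 1}

    per_cell_progeny = {}
    for cell, barcodes in cell_barcodes.items():
        if barcodes & ambiguous:
            # Assumption: set such cells aside rather than guess the source.
            continue
        # A cell infected by several unique barcodes: sum their progeny.
        per_cell_progeny[cell] = sum(progeny_counts.get(bc, 0) for bc in barcodes)

    return per_cell_progeny, ambiguous


# Toy data (invented): cell_B carries two barcodes; cell_C and cell_D share one.
cells = {
    "cell_A": {"AAAC"},
    "cell_B": {"GGTT", "CCAT"},
    "cell_C": {"TTGA"},
    "cell_D": {"TTGA"},
}
progeny = {"AAAC": 10, "GGTT": 3, "CCAT": 4, "TTGA": 7}
per_cell, shared = link_progeny_to_cells(cells, progeny)
```

      In this toy case, cell_B is assigned the summed progeny of its two barcodes, while cell_C and cell_D are left unassigned because their barcode cannot be traced to a single cell.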

      Finally, the reviewer notes that some cells appear to produce progeny virions despite failing to express one or more viral genes. Such cells can be explained in one of two ways. First, as noted immediately above, we expect a small fraction (4-8%) of the points to be erroneous due to a lack of a guaranteed one-to-one link between cell and progeny for non-unique barcodes. Second, in some cases the missing viral gene could be a technical artifact caused by a stochastic failure to capture modestly expressed transcripts from the gene; this phenomenon, known as gene dropout, occurs at a fairly high rate in single-cell experiments (see Qiu Nature Communications 2020 for a detailed discussion). Genes that are expressed at lower levels, like the influenza virus polymerase genes, are more likely to be missed during single-cell RNA sequencing. The absent viral genes in each infected cell can be explored in detail using the interactive plots at https://jbloomlab.github.io/barcoded_flu_pdmH1N1/

    2. Public Review:

      In this article, a novel technique allowing the linking of viral transcription levels and progeny virion production is presented. Barcoded libraries of an H1N1 influenza virus (two genes were barcoded near the 3' end) were used to infect cells using an experimental approach ensuring that, in the low multiplicity of infection condition, each cell is infected by one virion and that nearly every virion has a unique barcode. Single-cell RNA sequencing combined with sequencing of the supernatants then makes it possible to trace barcoded progeny viruses back to the cells that produced them. Comparing the detection frequencies of barcodes in the single-cell sequencing and in the sequencing of the supernatants allows the levels of viral transcription and progeny virion production to be compared.

      Observations that viral transcription levels are very heterogenous at the single-cell level are not novel, but reinforce those from previous studies. The major findings of this study are (i) progeny virion production is also very heterogenous, i.e., a few cells produce most of the progeny virions and (ii) there is a poor correlation between viral transcription levels and progeny virion production at the single-cell level.

      Strengths: The article is very well written, the experimental choices are very well justified and the methods are very detailed, allowing the possibility of reproducing the work performed in this study. The conclusions are very well supported by the data and the limitations of the study and how those might influence the conclusions are also clearly explained. In addition, several experimental caveats, such as PCR cross-overs in next-generation sequencing and cell multiplets in single-cell sequencing, were well accounted for, which is not always the case in studies using these techniques.

      Weaknesses: It seems that the results presented here are from one single experiment. How reproducible are the results?

      As explained in the article, it is important that nearly every virion has a unique barcode. This was assessed by sequencing the barcodes in the virus libraries. Between 92% and 96% of the barcodes were unique. With this information, it should be possible to assess whether non-unique barcodes were detected in infected cells, and if yes, remove these from the downstream analysis.

      It seems like all the information available in this very rich dataset was not fully exploited. For instance, Figure 5C suggests that cells missing the expression of one viral gene might still be able to produce progeny viruses. It would be interesting to have the information regarding which gene was not expressed in these cells.

      The introduction and discussion are rather short and the article could benefit from expanding them. Additional speculations about viral or cellular factors (e.g. differences in innate immune responses, differences in cell division status) that might explain the observed heterogeneity, both at the viral transcription and viral progeny virus production levels, would be interesting.

    3. eLife assessment

      This important paper reports a novel, compelling method, based on barcoding viral genes and next-generation sequencing, to quantify both viral transcription levels and progeny virus production in influenza virus-infected cells at the single-cell level. The authors show that viral transcription and progeny virus production are unexpectedly poorly correlated, and that cells in which viral RNAs are transcribed at high levels are not necessarily those producing the most progeny virions. Because of its novelty, the study will be of interest to the broader virology community.

    1. Reviewer #3 (Public Review):

      This study explores how condensin and telomere proteins cooperate to facilitate sister chromatid disjunction at chromosome ends during anaphase. Building upon previous results published by the same group (Reyes et al. 2015, Berthezene et al. 2020), the authors demonstrate that condensin is essential for sister telomere disjunction in anaphase in fission yeast. The primary role of condensin appears to be counteracting cohesin, which holds sister telomeres together. Furthermore, condensin is found to be enriched at telomeres, and this enrichment partially relies on Taz1, the principal telomere factor in S. pombe. The loss of Taz1 does not cause an obvious defect in sister telomere disjunction, which prevents drawing strong conclusions about its role in this process.

    2. eLife assessment

      This important study characterises the involvement of condensin complexes in the segregation of telomeres in fission yeast. The authors present solid evidence in support of their claims, employing a diverse range of complementary techniques. This research will be of interest for cell biologists working on chromosome biology and cell division.

    3. Reviewer #1 (Public Review):

      Colin et al. demonstrated that condensin is a key factor for the disjunction of sister telomeres during mitosis and proposed that this is because condensin restrains the telomere association of cohesin. The authors first showed that condensin binds telomeres in mitosis, as evidenced by ChIP-qPCR and calibrated ChIP-seq. They further demonstrated that compromising condensin's activity leads to a failure in the disjunction of telomeres, with convincing cytological and HI-seq evidence. Two telomeric proteins, Taz1 and Mit1, were identified that specifically regulate the telomere association of condensin. Deletion of these genes decreased/increased condensin's telomere association and exacerbated/remedied the defective telomere disjunction in a condensin mutant, echoing the role of condensin in telomere disjunction. They proposed that the underlying mechanism is that condensin inhibits cohesin's accumulation at telomeres. However, the evidence for this claim might need to be further strengthened. Nevertheless, this study uncovered a novel role of condensin in the separation of telomeres of sister chromosomes and opens the question of how condensin regulates the structure of chromosomal ends.

    4. Reviewer #2 (Public Review):

      This manuscript presents a comprehensive investigation into the role of condensin complexes in telomere segregation in fission yeast. The authors employ chromatin immunoprecipitation analysis to demonstrate the enrichment of condensin at telomeres during anaphase. They then use condensin conditional mutants to confirm that this complex plays a crucial role in sister telomere disjunction as well as the unclustering of telomeric regions from the preceding Rabl configuration. Interestingly, they show that condensin's role in telomere disjunction is unlikely related to catenation removal but rather related to the organization of telomeres in cis and/or the elimination of structural constraints or proteins that hinder separation.

      The authors also investigate the regulation of condensin localization to telomeres and reveal the involvement of the shelterin subunit Taz1 in promoting condensin's association with telomeres while demonstrating that the chromatin remodeler Mit1 prevents excessive loading of condensin onto telomeres. Finally, they show that cohesin acts as a negative regulator of telomere separation, counteracting the positive effects of condensin.

      Overall, the manuscript is well-executed, and the authors provide sufficient supporting evidence for their claims. There are a couple of aspects that arise from this study that when fully elucidated will lead to a mechanistic understanding of important biological processes. For instance, the exact mechanism by which Taz1 affects condensin loading or the mechanistic link between cohesin and condensin, especially in the context of their opposing roles, are exciting prospects for the future and it is possible that future work within the context of telomeres might provide valuable insights into this question.

      Another crucial point emphasized by the manuscript is that the role of condensin in telomere segregation extends beyond facilitating catenation removal.

    1. Reviewer #2 (Public Review):

      The authors have addressed most of the concerns. Yet, I still think the authors should at least mention in the article the residues involved in the intra-pore lipid binding pockets for further experimental validation (not only those residues involved in disease). This is important because the lipid-like density information usually does not come integrated into the PDB structures, so it is not easily accessible for non-structural biologists. The structural data seem solid, and the MD data support the notion that the GJC is in a putative closed state.

    2. Reviewer #1 (Public Review):

      Gap junctions, formed from connexins, are important in cell communication, allowing ions and small molecules to move directly between cells. While structures of connexins have previously been reported, the structure of Connexin 43, which is the most widely expressed connexin and is important in many physiological processes, was not known. Qi et al. used cryo-EM to solve the structure of Connexin 43. They then compared this structure to structures of other connexins. Connexin gap junctions are built from two "hemichannels" consisting of hexamers of connexins. Hemichannels from two opposing cells dock together to form a complete channel that allows the movement of molecules between cells. N-terminal helices from each of the 6 subunits of each hemichannel allow control of whether the channels are open or closed. Previously solved structures of Cx26 and Cx46/50 have the N-termini pointing down into the pore of the protein, leaving a central pore, and so these channels have been considered to be open. The structure that Qi et al. observed has the N-termini in a more raised position with a narrower pore through the centre. This led them to speculate whether this was the "closed" form of the protein. They also noted that, if only the protein was considered, there were gaps between the N-terminal helices, but these gaps were filled with lipid-like molecules. They, therefore, speculated that lipids were important in the closure mechanism. To address whether their structure was open or closed with respect to ions, they carried out molecular dynamics studies and demonstrated that, under the conditions of the molecular dynamics, ions did not traverse the channel when the lipids were present.

      Strengths:

      The high-resolution cryo-EM density maps clearly show the structure of the protein with the N-termini in a lateral position and lipid density blocking the gaps between the neighbouring helices. The conformation that they observe when they have solved the structure from protein in detergent is also seen when they reconstitute the protein into nanodiscs, which is ostensibly a more membrane-like environment. They, therefore, would appear to have trapped the protein in a stable conformational state.

      The molecular dynamics simulations are consistent with the channel being closed when the lipid is present and raise the possibility of lipids being involved in regulation.

      A comparison of this structure with other structures of connexin channels and hemichannels gives another representation of how the N-terminal helix of connexins can variously be involved in the regulation of channel opening.

      Weaknesses:

      While the authors have trapped a relatively stable state of the protein and shown that, under the conditions of their molecular dynamics simulations, ions do not pass through, it is harder to understand whether this is physiologically relevant. Determining this would be beyond the scope of the article. To my knowledge there is no direct evidence that lipids are involved in regulation of connexins in this way, but this is also an interesting area for future exploration. It is also possible that lipids were trapped in the pore during the solubilisation process making it non-physiological. The authors acknowledge this and they describe the structure as a "putative" closed state.

      The positions of the disease-linked mutations shown in Figure 4 are interesting. However, the authors don't discuss or speculate on how any of these mutations could affect the binding of the lipids or the conformational state of the protein.

      It should also be noted that a structure of the same protein has recently been published. This shows a very similar conformation of the N-termini with lipids bound in the same way, despite solubilising in a different detergent.

    3. Author Response:

      The following is the authors’ response to the original reviews.

      Major Revisions:

      1) Although we appreciate this work was carried out independently, it would improve this paper if the structure presented here was compared to the recently published structure of Cx43 (Nat Commun 14, 931 (2023)), with the conclusions added to the discussion.

      We encourage the readers to read both our study on Cx43 and the one mentioned by the reviewer. However, we believe the optimal format for such a comparison is going to be a more comprehensive review article, which is outside the scope of our study.

      2) Please elaborate on the lipid-binding pockets observed for lipid 1, lipid 2, and the N-lipid/PGL. For example, what are the residues involved in these lipid-protein interactions? Are these residues conserved in other connexin isoforms? Do these lipid-binding pockets match with previous structures, including the recent Cx43 structure? Please clarify what lipid sites are ambiguous due to insufficient resolution.

      Within the scope of our study, we have shown that some of the disease-linked residues are located in close proximity to the lipid sites (Fig. 4b). This suggests a possible role of the lipid sites in diseases associated with Cx43 mutations (and possibly with the mutations in other connexins, as the structures of other connexin channels also feature bound lipids inside the pore region). We feel that a more in-depth comparison will require a careful study, beyond the analysis that we have performed here, and for this reason we would like to reserve such a detailed comparison for our future work (possibly a comprehensive review article on connexin structure and function).

      3) The NT domain and TM2 segments are referred to as the gate region. If there is no strong evidence to support this claim then please use "putative" gate region.

      We have updated the text accordingly, referring to this region as a putative gate region where appropriate.

      4) It is mentioned that there is a reorientation of extracellular loops 1 and 2 after Gap junction formation. Based on their structures, I wonder how this rearrangement alters the channel conduction pathway. For example, Do the electrostatic surface and hydrophobic properties change? Please consider adding further details as this information could be useful to understand why some properties of hemichannels differ from intercellular GJ channels.

      We have updated Fig. 5 with an illustration of the Cx43 HC surface coloured according to electrostatic potential (to match the same representation of the Cx43 GJC). It is apparent that the rearrangement of extracellular loops 1 and 2 does not dramatically alter the electrostatic properties of the HC relative to the GJC. A more obvious difference is in the local environment of the ECLs: it is radically different in a “free” HC (exposed to the solvent or to the extracellular space of a cell), compared to the ECL environment in a connexon within a GJC (which is sealed by a docked connexon from the opposite membrane).

      5) Related to the previous point, the pore profile shown in Figure 5C shows that there is a constriction site in the extracellular part with the same diameter as the observed constriction caused by the NT domain. This constriction point seems to be associated with the high energies calculated for Cl-. Please clarify if this constriction is produced by the formation of the GJC or is also present in HC?

      This is the same constriction zone, and the Cl- barriers are further down the channel axis where the electrostatic potential of the protein is negative. We have included a similar calculation for the HC simulation in Fig. 5 (revised Fig. 5f).

      6) Related to the MD simulations shown in Figure 5d: if the voltage is applied across the whole GJC, the free energy under voltage should not be symmetric. Please clarify.

      The symmetry observed in the free energies is due to the fact that the ions enter and exit from the same hemichannel. Only at very high voltages do we observe some rare full GJC permeation events, which slightly unbalance the free energy at 500 mV.

      7) The scheme in Figure 6 may need further editing. The authors propose a putative closed state in which lipids are bound next to the NT, but we suggest it should be made clearer in the figure that this is a putative model, since there is no functional evidence supporting the role of these lipids in the gating/permeation properties of Cx43. Also, please clarify what is meant by a "semi-permeable gate" - a channel that only permeates ions but not molecules?

      We have updated the legend of Figure 6 to clearly reflect that this is a putative model. The “semi-permeable” state of the channel is something that was suggested previously by the authors of the Cx31.3 study, and we refer to that structure in the figure.

      Minor comments:

      1) In the Results section there are some statements that currently lack solid experimental support. Please consider editing or moving this text to the discussion section only. A good example of this is the Disease-linked mutation section, specifically lines 199-206. In another example, in lines 237-238 the authors state that the NT can move laterally and vertically, but this idea still requires experimental validation.

      We feel that the original formulations of these portions of the text are appropriate. Disrupting them would interrupt the flow of the manuscript, and we prefer to stay with the original text in this case.

      2) Line 283. "With these structures in mind, we can now establish the existence of several structurally defined gating substates of the connexin channels". Please, tone down this statement. Replace "establish" with "propose" or another more appropriate word.

      We have updated the text as suggested (“propose” instead of “establish”).

      3) Line 313-314. " The presence of such molecules could have important implications for HC or GJC assembly, substrate permeation, and molecular gating". Currently, this entire statement does not have any support. Is there any paper that authors can discuss to suggest with some basis that lipids might have a role in assembly, permeation or gating?

      We feel that this statement is sufficiently careful, conveying a thought that the presence of such molecules could have important implications for various HC- or GJC-related processes. It is not a particularly strong claim and seems to be appropriate in this context.

      4) It seems that the structure shown in panels A and C in Figure 2 are shown in opposite directions, which makes the figure confusing. If needed, please rotate the structure in panel A to show the cytosolic part of the protein as panel C. Also, in the same figure, panels G and F are wrongly labeled. Please correct.

      For Fig. 2a, the angle is very different from anything else we show in the figure, so we would rather keep this as it is now. We have corrected the labelling for Fig. 2g-h.

      5) Check spelling mistakes in the legend of Extended data Fig.2, Extended data Fig.9, and line 243.

      We are grateful to the reviewers for pointing out the typos, which have now been corrected.

      6) The colors for G-L isoforms are not specified in Extended Data Fig.10. Please correct this.

      We updated the figure, removing the PGL label (the correct label is “lipid-N”).

      7) It is not clear what the difference is between PGL and the N-lipid density. Does PGL refer to the lipid-like density observed in nanodiscs, as indicated in Extended Fig. 4 and 10? Please clarify this issue in the manuscript.

      The labeling has been corrected in line with the revised version of the manuscript (this density element is now referred to as the “lipid-N”).

      8) Page 7 line 234-235 "The pore opening has a solvent-accessible radius of ~6Å (Figure 5c) very close to the effective hydrated radius of K+ (~6.6 Å) and Cl- (~7.2 Å). This makes it the most narrow pore opening...", it should be diameter, not radius.

      We have added a calculation for the HC (new Fig. 5f) and corrected the text as follows (line 234):

      “The pore opening observed in our cryo-EM structures has a solvent-accessible radius of ~3 Å (Figure 2b). This makes it the most narrow pore opening observed for a connexin channel to date (a comparison of the pore openings in the cryo-EM structures of connexin channels is shown in Extended Data Fig. 12). However, the average solvent-accessible radius of the pore during molecular dynamics was ~6 Å (Figure 5c); note that the effective hydrated radius of K+ and Cl- is ~3.3 Å and ~3.6 Å, respectively.”

      And line 277:

      “The average pore radius during the simulations was consistent with that observed in the cryo-EM structure (Fig. 5f).”
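
      The radius/diameter correction above comes down to simple arithmetic, which can be sketched as follows. The numbers are those quoted in the comment and the corrected text; the helper function is hypothetical and performs a purely geometric comparison, which by itself does not decide whether an ion permeates (the free-energy profiles from the simulations address that).

```python
# Values quoted in the reviewer comment, originally labelled as radii (in Å).
quoted_values = {"K+": 6.6, "Cl-": 7.2}

# Read as diameters, they give the hydrated radii used in the corrected text.
hydrated_radius = {ion: round(d / 2, 2) for ion, d in quoted_values.items()}


def ion_fits(pore_radius, ion_radius):
    """Geometric check only: does a sphere of ion_radius fit in the pore?"""
    return ion_radius <= pore_radius


cryo_em_radius = 3.0  # ~3 Å, solvent-accessible radius in the cryo-EM structure
md_mean_radius = 6.0  # ~6 Å, average radius during molecular dynamics

fits_cryo_em = {ion: ion_fits(cryo_em_radius, r) for ion, r in hydrated_radius.items()}
fits_md_mean = {ion: ion_fits(md_mean_radius, r) for ion, r in hydrated_radius.items()}
```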

    1. Author Response

      Reviewer #2 (Public Review):

      The manuscript by Ma et al, "Two RNA-binding proteins mediate the sorting of miR223 from mitochondria into exosomes" examines the contribution of two RNA-binding proteins on the exosomal loading of miR223. The authors conclude that YBX1 and YBAP1 work in tandem to traffic and load miR223 into the exosome. The manuscript is interesting and potentially impactful. It proposes the following scenario regarding the exosomal loading of miR223: (1) YBAP1 sequesters miR223 in the mitochondria, (2) YBAP1 then transfers miR223 to YBX1, and (3) YBX1 then delivers miR223 into the early endosome for eventual secretion within an exosome. While the authors propose plausible explanations for this phenomenon, they do not specifically test them and no mechanism by which miR223 is shuttled between YBAP1 and YBX1, and the exosome is shown. Thus, the paper is missing critical mechanistic experiments that could have readily tested the speculative conclusions that it makes.

      Comments:

      1) The major limitation of this paper is that it fails to explore the mechanism of any of the major changes it describes. For example, the authors propose that miR223 shuttles from mitochondrially localized YBAP1 to P-body-associated YBX1 to the exosome. This needs to be tested directly and could be easily addressed by showing a transfer of miR223 from YBAP1 to YBX1 to the exosome.

      Testing this idea using fluorescently labeled miR223 would indeed be an ideal experiment. However, miRNA imaging presents challenges. As reviewer 1 pointed out, and we have now confirmed, the atto-647 dye itself localizes to mitochondria. We will continue our efforts to identify a suitable fluorescent label for miR223 in order to be in a position to evaluate the temporal relationship between mitochondrial and endosomal miR223.

      2) If YBAP1 retains miR223 in mitochondria, what is the trigger for YBAP1 to release it and pass it off to YBX1? The authors speculate in their discussion that sequestration of mito-miR223 plays a "role in some structural or regulatory process, perhaps essential for mitochondrial homeostasis, controlled by the selective extraction of unwanted miRNA into RNA granules and further by secretion in exosomes...". This is readily testable by altering mitochondria dynamics and/or integrity.

      A previous study has reported that YBAP1 can be released from mitochondria to the cytosol during HSV-1 infection (Song et al., 2021). However, due to restrictions, we are unable to conduct experiments using HSV to verify this condition. We attempted to induce mitochondrial stress by using different concentrations of CCCP, but we did not observe the release of YBAP1 from mitochondria after CCCP treatment. We speculate that not all mitochondrial stress conditions can trigger YBAP1 release. Investigating the mechanism of mito-miR223 release from mitochondria is one of our interests that we aim to explore in future studies.

      3) Much of the miRNA RT-PCR analysis is presented as a ratio of exosomal/cellular. This particular analysis assumes that cellular miRNA is unaffected by treatments. For example, Figure 1a shows that the presence of exosomal miR223 is significantly reduced when YBX1 is knocked out. This analysis does not consider the possibility that YBX1-KO alters (up or down-regulates) intracellular miR223 levels. Should that be the case, the ratiometric analysis is greatly skewed by intracellular miRNA changes. It would be better to not only show the intracellular levels of the miRs but also normalize the miRNA levels to the total amount of RNA isolated or an irrelevant/unchanged miRNA.

      Our previous publications demonstrated that miR223 levels are increased in YBX1-KO cells and decreased in exosomes derived from YBX1 KO cells. However, no significant changes were observed in miR190 levels (Liu et al., 2021; Shurtleff et al., 2016). The repeated data has been included in Figure 1a.

      For the analysis of other miRNAs by RT-PCR, we assessed changes in intracellular and exosomal miRNA levels in the corresponding figures. In the qPCR analysis, miRNA levels were normalized to the total amount of RNA.
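
      The concern raised in this comment can be made concrete with a small numerical sketch. The values below are invented for illustration and are not data from the study; they only show that an exosomal/cellular ratio can fall even when the exosomal level itself does not change.

```python
def exo_to_cell_ratio(exosomal, cellular):
    """Return the exosomal/cellular ratio for one miRNA measurement."""
    return exosomal / cellular


# Hypothetical miRNA amounts per unit of total RNA (arbitrary units).
wt = {"exosomal": 10.0, "cellular": 100.0}
# Scenario: the exosomal level is unchanged but the cellular level doubles.
ko = {"exosomal": 10.0, "cellular": 200.0}

wt_ratio = exo_to_cell_ratio(**wt)
ko_ratio = exo_to_cell_ratio(**ko)

# The ratio halves although nothing changed in the exosomes.
ratio_fold_change = ko_ratio / wt_ratio
exosomal_fold_change = ko["exosomal"] / wt["exosomal"]
```

      Reporting the cellular and exosomal levels separately, each normalized to total RNA, avoids this ambiguity.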

      4) In figure 1, the authors show that in YBX1-KO cells, miR223 levels are decreased in the exosome. They further suggest this is because YBX1 binds with high affinity to miR223. This binding is compared to miR190 which the authors state is not enriched in the exosome. However, no data showing that miR190 is not present in the exosome is shown. A figure showing the amount of cellular and exosomal miR223 and 190 should be shown together on the same graph.

      In previous publications we demonstrated that miR190 is not localized in exosomes and not significantly changed in YBX1 knockout (KO) cells and exosomes derived from YBX1 KO cells (Liu et al., 2021; Shurtleff et al., 2016). The repeated data has been included in Figure 1a.

      5) Figure 2 Supplement 1 - To determine the nucleotides responsible for interacting with YBX1, the authors made several mutations within the miR223 sequence. However, no explanation is given regarding the mutant sequences used or what the ratios mean. Mutant sequences need to be included. How do the authors conclude that UCAGU is important when the locations of the mutations are unclear? Also, the interpretation of this data would benefit from a binding affinity curve as shown in Fig 2C.

      The ratio is of labeled miR223/unlabeled miR223 (wt and mutant). All mutant sequences of miR223 have been included in Figure 2 supplement 1.

      6) While the binding of miR223mut to YBX1 is reduced, there is still significant binding. Does this mean that the 5nt binding motif is not exact? Do the authors know if there are multiple nucleotide possibilities at these positions that could facilitate binding? Perhaps confirming binding "in vivo" via RIP assay would further solidify the UCAGU motif as critical for binding to YBX1.

      The binding affinity of miR223mut with YBX1 is reduced approximately 27-fold compared to miR223. We speculate that the secondary structure of miR223 may contribute to the interaction with YBX1.

      Our EMSA data, in vitro packaging data, and exosome analysis reinforce the conclusion that UCAGU is critical for YBX1 binding. These findings suggest that the presence of the UCAGU motif in miR223 is crucial for its interaction with YBX1 and subsequent sorting into exosomes.

      7) Figures 2g, h - It would be nice to show that miR190mut also packages in the cell-free system. This would confirm that the sequence is responsible. Also, to confirm that the sorting of miR223 is YBX1-dependent, a cell-free reaction using cytosol and membranes from YBX1 KO cells is needed.

      Although we have not performed the suggested experiment, we purified exosomes from cells overexpressing miR190sort and observed an increase in the enrichment of miR190sort in exosomes compared to miR190. This finding confirmed that the UCAGU motif facilitates miRNA sorting into exosomes.

      Regarding the in vitro packaging assay, our previously published paper demonstrated that cytosol from YBX1 knockout (KO) cells significantly reduces the protection of miR223 from RNase digestion. We concluded that the sorting of miR223 into exosomes is dependent on YBX1 (Shurtleff et al., 2016).

      8) In Figure 3a, the authors show that miR223 is mitochondrially localized. Does the sequence of miR223 (WT or Mut) matter for localization? Does it matter for shuttling between YBAP1 and YBX1?

      The localization of miR223mut has not been tested in our current study. We plan to conduct these experiments in the future.

      9) Supplement 3c - Is it strange that miR190 is not localized to any particular compartment? Is miR190 present ubiquitously and equally among all intracellular compartments?

      Most mature miRNAs are predominantly localized in the cytoplasm. Although there is no specific subcellular localization reported for miR190 in the literature, our experimental findings indicate a relatively high expression of miR190 in 293T cells. It is likely that most of miR190 is localized in the cytosol. However, it is also possible that a small fraction of miR190 may associate with a membrane, which could explain its distribution in various subcellular structures. Importantly, we did not observe enrichment of miR190 in the mitochondria or exosomes.

      10) Figure 3h - Why would the miR223 levels increase if you remove mitochondria? Does CCCP also cause miR223 upregulation? I would have thought miR223 would just be mis-localized to the cytosol.

      We report that the levels of cytoplasmic miR223 increase following the removal of mitochondria using CCCP treatment. While we cannot rule out the possibility that upregulation of miR223 is directly caused by CCCP treatment, we suggest that miR223 becomes mis-localized to the cytosol upon mitochondrial removal. Our data suggests that mitochondria contribute to the secretion of miR223 into exosomes. When mitochondria are removed by mitophagy, cytosolic miR223 is not efficiently secreted, which provides an alternative explanation for the observed increase in miR223 level after mitochondrial removal.

      11) Figure 3i - What is the meaning of "Urd" in the figure label? This isn't mentioned anywhere.

      “Urd” represents Uridine. Uridine is now spelled out in figure 3i. The absence of mitochondria can impact the function of the mitochondrial enzyme dihydroorotate dehydrogenase, which plays a role in pyrimidine synthesis. To address this issue, one approach is to supplement the cell culture medium with Urd. A previous study demonstrated that primary fibroblasts showed positive responses when Urd was added to the cell culture medium, resulting in improved cell viability for extended periods of time (Correia-Melo et al., 2017).

      12) Figure 3j - The data is presented as a ratio of EV/cell. Again, this inaccurately represents the amount of miR223 in the EV. This issue is apparent when looking at Figures 3h and 3j. In 3h, CCCP causes an upregulation of intracellular miR223. As such, the presumed decrease in EV miR223 after CCCP (3j) could be an artifact due to increased levels of intracellular miR223. Both intracellular and EV levels of miRs need to be shown.

      Both the intracellular and exosomal levels of miR223 have been included in Figure 3j.

      13) In Figure 4, the authors show that when overexpressed, YBX1 will pulldown YBAP1. Can the authors comment as to why none of the earlier purifications show this finding (Figure 1 for example)? Even more curious is that when YBAP1 is purified, YBX1 does not co-purify (Figure 4 supplement 1a, b).

      In Figure 4a-b, human YBX1 fused with a Strep II tag was purified from 293T cells using Strep-Tactin® Sepharose® resin in a one-step purification process. Our data has shown that YBAP1 is expressed in 293T cells.

      In Figure 1 and Figure 4 Supplement 1a, human YBX1 or YBAP1 fused with His and MBP tags were purified from insect cells using a three-step purification process involving Ni-NTA His-Pur resin, amylose resin, and Superdex-200 gel filtration chromatography.

      One possibility is that human YBX1 or YBAP1 may not interact well with insect YBAP1 or YBX1, which could result in separate tagged forms of YBX1 or YBAP1 isolated from insect cells.

      Another possibility is that the expression levels of insect YBAP1 and YBX1 may be too low. Consequently, tagged forms of YBX1 or YBAP1 expressed in insect cells may copurify with partners that are not readily detected by Coomassie blue stain. However, in Figure 4 Supplement 1b, human YBX1 fused with His and MBP tags was co-expressed with non-tagged human YBAP1, and both bands of YBX1 and YBAP1 were visible on the Coomassie blue gel after purification using Ni-NTA His-Pur resin, amylose resin, and Superdex-200 gel filtration chromatography.

      14) Figure 4f, g - The text associated with these figures is very confusing, as is the labeling for the input. Also, what is "miR223 Fold change" in this regard? Seeing as your IgG should not have IP'd anything, normalizing to IgG can amplify noise. As such, RIP assays are typically presented as % input or fold enrichment.

      The RIP assay results have been calculated and presented as a % input in Figure 4g.
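
      For readers unfamiliar with the % input presentation, a common way of computing it from qPCR Ct values can be sketched as follows. This is a generic convention and not necessarily the exact calculation used by the authors; it assumes roughly 100% amplification efficiency, and the function name and Ct values are hypothetical.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input recovered in an IP, computed from Ct values.

    input_fraction: fraction of the lysate saved as input (0.1 for 10%).
    Assumes the product doubles every cycle (about 100% efficiency).
    """
    # Shift the input Ct to the value expected for 100% of the lysate.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


# Hypothetical Ct values, for illustration only.
example = percent_input(ct_ip=25.0, ct_input=24.0, input_fraction=0.1)
```

      Because the IgG control is not used as the denominator, noise in a near-empty IgG sample does not inflate the reported value.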

      15) Figure 4h - The authors show binding between miR223 and YBAP1 however it is not clear how significant this binding is. There is more than a 30-fold difference in binding affinity between miR223 and YBX1 than between miR223 and YBAP1. Even more, when comparing the EMSAs and fraction bound from figures 1 and 2 to those of Figure 4h, the binding between miR223 and YBAP1 more closely resembles that of miR190 and YBX1, which the authors state is a non-binder of YBX1. The authors will need to reconcile these discrepancies.

      We agree that YBAP1 and YBX1 differ quite significantly in the affinity of their interaction with miR223. It is difficult to draw conclusions from a comparison of the affinities of YBX1 for miR190 and YBAP1 for miR223. Nonetheless, a quantitative difference in the interaction of YBAP1 with miR223 and miR190 is apparent (Fig. 4h, i, j) and we observed no enrichment of miR190 in isolated mitochondria (Fig. 3 supplement 1a), whereas YBAP1 selectively IP’d miR223 from isolated mitochondria (Fig. 4f and g).

      16) Can the authors present the Kd values for EMSA data?

      The Kd values for the EMSA data have been added to the respective figures.
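
      As a rough illustration of how a Kd can be estimated from EMSA fraction-bound data, the sketch below fits a simple one-site binding model by least squares over a logarithmic grid. This is an assumption-laden example and not the authors' fitting procedure: the model, the function names, the concentration units and the synthetic data points are all hypothetical.

```python
def fraction_bound(protein_conc, kd):
    """One-site binding model: fraction of RNA bound at a protein concentration."""
    return protein_conc / (kd + protein_conc)


def fit_kd(concs, fractions, kd_min=1e-3, kd_max=1e3, steps=6000):
    """Return the Kd on a log-spaced grid that minimizes the squared error."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        sse = sum((fraction_bound(c, kd) - f) ** 2 for c, f in zip(concs, fractions))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic, noise-free data generated with Kd = 5 (arbitrary units).
concs = [0.5, 1, 2, 5, 10, 20, 50]
fractions = [fraction_bound(c, 5.0) for c in concs]
kd_est = fit_kd(concs, fractions)
```

      A grid search is used here only to keep the example dependency-free; a nonlinear least-squares routine would normally be preferred and would also give a confidence interval.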

      17) Figure 5 - Does YBAP1-KO affect mitochondrial protein integrity or numbers?

      We generated stable cell lines expressing 3xHA-GFP-OMP25 in both 293T WT and YBAP1-KO cells, but we did not observe any alterations in mitochondrial morphology (Author response image 1).

      Author response image 1.

      Additionally, we compared different mitochondrial markers by immunoblot in 293T WT cells and YBAP1-KO cells and did not observe any changes in these markers (data have been included in Figure 5b).

      18) Figure 6a - Are the authors using YBAP1 as their mitochondrial marker? Please include TOM20 and/or 22.

      In Figure 4c and 4e, our data clearly demonstrate that the majority of YBAP1 is localized in the mitochondria.

      To further validate this localization, we performed immunofluorescence staining using antibodies against endogenous Tom20 and YBX1. The immunofluorescence images document YBX1 associated with mitochondria (Author response image 2 and new Fig 6a.).

      Author response image 2.

      19) Figure 6b - Rab5 is an early endosome marker and may not fully represent the organelles that become MVBs. Co-localization at this point does not suggest that associating proteins will be present in the exosome, and it is possible that the authors are looking at the precursor of a recycling endosome. Even more, exosome loading does not occur at the early endosome, but instead at the MVB. Perhaps looking at markers of the late endosome such as Rab7 or ideally markers of the MVB such as M6P or CD63 would help draw an association between YBX1, YBAP1, and the exosome. Also, If the authors want to make the claim that interactions at the early endosome leads to secretion as an exosome, the authors should show that isolated EVs from Rab5Q79L-expressing cells contain miR223.

      We have previously used overexpressed Rab5(Q79L) to monitor the localization of exosomal content, specifically CD63 and YBX1, in enlarged endosomes (Liu et al. 2021, Fig. 4A, B). These endosomes exhibit a mixture of early and late endocytic markers, including CD63 (Wegner et al., 2010). Hence, the presence of Rab5(Q79L)-positive enlarged endosomes does not solely indicate early endosomes.

      20) The mentioning of P-bodies is interesting but at no time is an association addressed. This is therefore an overly speculative conclusion. Either show an association or leave this out of the manuscript.

      In a previous paper we demonstrated that YBX1 puncta colocalize with P-body markers EDC4, Dcp1 and DDX6 (Liu et al., 2021).

      21) In lines 55-58, the authors make the comment "However, many of these studies used sedimentation at ~100,000 g to collect EVs, which may also collect RNP particles not enclosed within membranes which complicates the interpretation of these data." Do RNPs not dissolve when secreted? Can the authors give a reference for this statement?

      In a previous paper, we demonstrated that the RNP Ago2 does not dissolve in the conditioned medium and is not in vesicles but sediments to the bottom of a density gradient (Temoche-Diaz et al., 2019).

    2. eLife assessment

      This is an important study that reports the discovery of a new pathway of miRNA sorting to exosomes, involving a mitochondrially-localized protein. The evidence provided by some of the biochemical data is convincing. However, the major body of evidence is still incomplete.

    3. Reviewer #1 (Public Review):

      This study focuses on molecular and cellular mechanisms underlying the sorting of miRNAs into exosomes originating from multivesicular bodies (MVBs). Following up on their previous work, the authors analysed the biochemical basis of miRNA selection by the RNA-binding protein YBX1 which is known to participate in this sorting. Using electrophoretic mobility shift assays (EMSA) involving a series of YBX1 constructs, they pinpointed the key role of the cold shock domain of YBX1 (supported by the C-terminal domain) in miRNA binding. By comparing a secreted model miRNA (miR223), a control cytoplasmic miRNA that is not enriched in exosomes (miR190), and a series of their swap mutants, the authors identified what could be a sequence motif enabling YBX1 to discriminate - through direct binding - between miRNAs to be secreted or to be retained.

      The authors then wondered from which subcellular pool miR223 could be mobilised for secretion. They turned their attention to the mitochondria and found evidence of miR223 association with these organelles. Interestingly, when mitochondria were depleted by Parkin overexpression and CCCP treatment, the cellular level of miR223, but not of miR190, increased, whereas its enrichment in extracellular vesicles dropped. This observation led the authors to put forward a hypothesis whereby mitochondria could be involved in miR223 mobilisation into exosomes. This process would be mediated by YBX1, which shuttles between mitochondria and endosomes, as was elegantly shown in live imaging experiments.

      Finally, the authors provide initial data implicating in this process the mitochondrial matrix protein YBAP1, broadly known as C1QBP, or p32. YBAP1 was found to interact with YBX1 and miR223 in pull-down assays. Moreover, direct and moderately strong miR223 binding by YBAP1 was confirmed by EMSA. Interestingly, just like YBX1, YBAP1 seems to prefer this substrate over miR190, indicating certain binding specificity. The observation that YBAP1 knockout resulted in the decreased association of miR223 with mitochondria, paralleled by its correspondingly better mobilisation into exosomes, enabled the authors to propose that YBAP1 could negatively control miR223 secretion at the level of mitochondria.

      Strengths

      This is a very interesting study proposing an elegant hypothesis and featuring a creative panel of methods, many of which will certainly be of interest to biochemists and cell biologists working with extracellular RNA and mitochondria (e.g. the Parkin/CCCP-mediated mitochondria depletion and the time-lapse imaging of RNA-binding proteins against cellular organelles).

      The authors did a good job of dissecting the YBX1 interaction with miR223 versus miR190. These experiments are performed at a high technical level, and their interpretation is straightforward and convincing. The nearly two orders of magnitude difference in affinity provides a plausible means by which YBX1 could recognise and funnel one, but not the other, miRNA into the secretion pathway.

      Another valuable piece of data is related to YBAP1. This important, deeply conserved protein, strongly implicated in severe mitochondrial diseases and cancer, remains poorly understood at the level of basic molecular mechanisms, and even its subcellular localisation is debated. The data presented by the authors reinforce the idea of its primarily mitochondrial localisation, in agreement with earlier studies. They also provide new information about the RNA-binding activity of YBAP1. First proposed to interact with RNA by Yagi et al., Nucleic Acids Res 2012 (doi:10.1093/nar/gks774), YBAP1 is confirmed in the present study as an RNA-binding protein of reasonable affinity, based on direct EMSA experiments involving a highly purified protein and natural RNAs. These data should encourage the community to explore the full RNA-binding potential of YBAP1/C1QBP/p32 in a wider variety of models, especially in the context of mitochondrial gene expression.

      Weaknesses

      While the authors might be right about the existence of a sequence motif that specifies miRNAs for exosome sorting by YBX1, it is at present difficult to disentangle the sequence and structure contributions to YBX1 binding within the variants described in the paper. RNA structure predictions, however imperfect, suggest that miR223-3p is a fully single-stranded transcript (ensemble ΔG = -0.33 kcal/mol, RNAfold), while miR190-5p is a tightly base-paired one (ensemble ΔG = -2.85 kcal/mol). This likely explains the differential affinity to YBX1, known to strongly prefer single-stranded RNAs. When mutating the putative sorting motif in miR223 (UCAGU>AGACA), the authors introduced some amount of secondary structure (ΔG = -1.04 kcal/mol), which could have impeded YBX1 binding. By contrast, the mutation of miR190 (AUAUG>UCAGU) significantly weakened the structure (ensemble ΔG = -2.21 kcal/mol), which might explain the improvement in YBX1 interaction.

      Mitochondria appear to be a plausible location for mobilisable RNAs, given their multiple associations with ribosomes, RNA-containing condensates, and other organelles. However, the presented evidence of the mitochondrial localisation of miR223 is limited. The colocalisation pattern of the ATTO 647-labelled miR223 with the well-behaved mitochondrial marker Tom22 is remarkable; such a neat overlap has so far only been observed for some abundant mtDNA-encoded transcripts, but not for an extraneous transcript. The interpretation of this result will depend a great deal on experimental details which, unfortunately, are missing for this section. ATTO 647N is known to be quantitatively recruited to mitochondria, producing just the same kind of complete colocalisation, making it a perfect tool to visualise mitochondria in the cell (Han et al., Nat Commun 2017, doi:10.1038/s41467-017-01503-6). There is a worry that the colocalisation observed here might have been driven by the dye alone.

      Furthermore, the definition of the topology of RNA localisation with respect to the mitochondrial membranes remains challenging, and a number of more robust methods have been recently proposed to address this contentious issue. At a minimum, one would expect the authors to use RNase treatment, with or without Triton X-100 (as they did in the in vitro packaging assay), to see whether miR223 is indeed protected by the mitochondrial membranes and, therefore, resides in the interior of the organelles. For now, based on the presented data, one can safely conclude that miR223 is associated with the mitochondria, without claiming that it is necessarily inside them.

      The Parkin/CCCP method is very powerful, which is its strength and weakness at the same time. miR223 secretion does decrease when the mitochondria are depleted. However, it is unclear how direct and specific this effect is. The destruction of mitochondria likely crashed the cellular ATP levels, which could have generally affected vesicular transport, not only miR223 sorting. A more detailed analysis of the overall abundance of extracellular vesicles and their cargo under these conditions could reveal the true scope of the mitochondrial contribution to RNA secretion.

      YBAP1 is a difficult, indeed "treacherous", protein to work with. Its strong negative charge (pI = 4) makes it easily stick to positively charged proteins, such as YBX1 (pI = 9.9). Such interactions are routinely observed in pulldown assays from cell lysates, where all components are intermixed (but often cannot be corroborated by in situ or in vivo approaches). The authors carefully showed that YBX1 and YBAP1 do not significantly colocalise in the cell, which makes the interplay between the two proteins in miR223 sorting difficult to stage. They also studied the miR223 distribution between mitochondria and extracellular vesicles using YBAP1 knockout cells. However, such cells are known to be very sick and have an extremely pleiotropic mitochondrial and metabolic phenotype. Therefore, the apparent implication of YBAP1 in miR223 sorting might be less direct than currently envisaged.

    4. Reviewer #2 (Public Review):

      The manuscript by Ma et al, "Two RNA-binding proteins mediate the sorting of miR223 from mitochondria into exosomes" examines the contribution of two RNA-binding proteins to the exosomal loading of miR223. The authors conclude that YBX1 and YBAP1 work in tandem to traffic and load miR223 into the exosome. The manuscript is interesting and potentially impactful. It proposes the following scenario regarding the exosomal loading of miR223: (1) YBAP1 sequesters miR223 in the mitochondria, (2) YBAP1 then transfers miR223 to YBX1, and (3) YBX1 then delivers miR223 into the early endosome for eventual secretion within an exosome. While the authors propose plausible explanations for this phenomenon, they do not specifically test them, and no mechanism is shown by which miR223 is shuttled from YBAP1 to YBX1 and on to the exosome. Thus, the paper is missing critical mechanistic experiments that could have readily tested the speculative conclusions that it makes.

      Comments:

      1. The major limitation of this paper is that it fails to explore the mechanism of any of the major changes it describes. For example, the authors propose that miR223 shuttles from mitochondrially localized YBAP1 to P-body-associated YBX1 to the exosome. This needs to be tested directly and could be easily addressed by showing a transfer of miR223 from YBAP1 to YBX1 to the exosome.

      2. If YBAP1 retains miR223 in mitochondria, what is the trigger for YBAP1 to release it and pass it off to YBX1? The authors speculate in their discussion that sequestration of mito-miR223 plays a "role in some structural or regulatory process, perhaps essential for mitochondrial homeostasis, controlled by the selective extraction of unwanted miRNA into RNA granules and further by secretion in exosomes...". This is readily testable by altering mitochondria dynamics and/or integrity.

      3. Much of the miRNA RT-PCR analysis is presented as a ratio of exosomal/cellular. This particular analysis assumes that cellular miRNA is unaffected by treatments. For example, Figure 1a shows that the presence of exosomal miR223 is significantly reduced when YBX1 is knocked out. This analysis does not consider the possibility that YBX1-KO alters (up or down-regulates) intracellular miR223 levels. Should that be the case, the ratiometric analysis is greatly skewed by intracellular miRNA changes. It would be better to not only show the intracellular levels of the miRs but also normalize the miRNA levels to the total amount of RNA isolated or an irrelevant/unchanged miRNA.

      4. In figure 1, the authors show that in YBX1-KO cells, miR223 levels are decreased in the exosome. They further suggest this is because YBX1 binds with high affinity to miR223. This binding is compared to miR190 which the authors state is not enriched in the exosome. However, no data showing that miR190 is not present in the exosome is shown. A figure showing the amount of cellular and exosomal miR223 and 190 should be shown together on the same graph.

      5. Figure 2 Supplement 1 - To determine the nucleotides responsible for interacting with YBX1, the authors made several mutations within the miR223 sequence. However, no explanation is given regarding the mutant sequences used or what the ratios mean. Mutant sequences need to be included. How do the authors conclude that UCAGU is important when the locations of the mutations are unclear? Also, the interpretation of this data would benefit from a binding affinity curve as shown in Fig 2C.

      6. While the binding of miR223mut to YBX1 is reduced, there is still significant binding. Does this mean that the 5nt binding motif is not exact? Do the authors know if there are multiple nucleotide possibilities at these positions that could facilitate binding? Perhaps confirming binding "in vivo" via RIP assay would further solidify the UCAGU motif as critical for binding to YBX1.

      7. Figures 2g, h - It would be nice to show that miR190mut also packages in the cell-free system. This would confirm that the sequence is responsible. Also, to confirm that the sorting of miR223 is YBX1-dependent, a cell-free reaction using cytosol and membranes from YBX1 KO cells is needed.

      8. In Figure 3a, the authors show that miR223 is mitochondrially localized. Does the sequence of miR223 (WT or Mut) matter for localization? Does it matter for shuttling between YBAP1 and YBX1?

      9. Supplement 3c - Is it strange that miR190 is not localized to any particular compartment? Is miR190 present ubiquitously and equally among all intracellular compartments?

      10. Figure 3h - Why would the miR223 levels increase if you remove mitochondria? Does CCCP also cause miR223 upregulation? I would have thought miR223 would just be mis-localized to the cytosol.

      11. Figure 3i - What is the meaning of "Urd" in the figure label? This isn't mentioned anywhere.

      12. Figure 3j - The data is presented as a ratio of EV/cell. Again, this inaccurately represents the amount of miR223 in the EV. This issue is apparent when looking at Figures 3h and 3j. In 3h, CCCP causes an upregulation of intracellular miR223. As such, the presumed decrease in EV miR223 after CCCP (3j) could be an artifact due to increased levels of intracellular miR223. Both intracellular and EV levels of miRs need to be shown.

      13. In Figure 4, the authors show that when overexpressed, YBX1 will pulldown YBAP1. Can the authors comment as to why none of the earlier purifications show this finding (Figure 1 for example)? Even more curious is that when YBAP1 is purified, YBX1 does not co-purify (Figure 4 supplement 1a, b).

      14. Figure 4f, g - The text associated with these figures is very confusing, as is the labeling for the input. Also, what is "miR223 Fold change" in this regard? Seeing as your IgG should not have IP'd anything, normalizing to IgG can amplify noise. As such, RIP assays are typically presented as % input or fold enrichment.

      15. Figure 4h - The authors show binding between miR223 and YBAP1 however it is not clear how significant this binding is. There is more than a 30-fold difference in binding affinity between miR223 and YBX1 than between miR223 and YBAP1. Even more, when comparing the EMSAs and fraction bound from figures 1 and 2 to those of Figure 4h, the binding between miR223 and YBAP1 more closely resembles that of miR190 and YBX1, which the authors state is a non-binder of YBX1. The authors will need to reconcile these discrepancies.

      16. Can the authors present the Kd values for EMSA data?

      17. Figure 5 - Does YBAP1-KO affect mitochondrial protein integrity or numbers?

      18. Figure 6a - Are the authors using YBAP1 as their mitochondrial marker? Please include TOM20 and/or 22.

      19. Figure 6b - Rab5 is an early endosome marker and may not fully represent the organelles that become MVBs. Co-localization at this point does not suggest that associating proteins will be present in the exosome, and it is possible that the authors are looking at the precursor of a recycling endosome. Even more, exosome loading does not occur at the early endosome, but instead at the MVB. Perhaps looking at markers of the late endosome such as Rab7 or ideally markers of the MVB such as M6P or CD63 would help draw an association between YBX1, YBAP1, and the exosome. Also, if the authors want to make the claim that interactions at the early endosome lead to secretion as an exosome, the authors should show that isolated EVs from Rab5Q79L-expressing cells contain miR223.

      20. The mentioning of P-bodies is interesting but at no time is an association addressed. This is therefore an overly speculative conclusion. Either show an association or leave this out of the manuscript.

      21. In lines 55-58, the authors make the comment "However, many of these studies used sedimentation at ~100,000 g to collect EVs, which may also collect RNP particles not enclosed within membranes which complicates the interpretation of these data." Do RNPs not dissolve when secreted? Can the authors give a reference for this statement?
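
      The concern about EV/cell ratios raised in the comments above can be made concrete with a small numerical sketch. The values are hypothetical relative miRNA levels (imagined as already normalized to total RNA or to an unchanged reference miRNA), not measurements from the manuscript, and the function name is chosen only for illustration.

```python
def ev_to_cell_ratio(ev_level: float, cell_level: float) -> float:
    """Ratio of EV-associated to cellular miRNA level (both in the same units)."""
    return ev_level / cell_level


# Hypothetical levels: the treatment doubles the cellular pool
# but leaves the amount of miRNA recovered in EVs unchanged.
control = {"ev_level": 1.0, "cell_level": 1.0}
treated = {"ev_level": 1.0, "cell_level": 2.0}

ratio_control = ev_to_cell_ratio(**control)
ratio_treated = ev_to_cell_ratio(**treated)

print(f"EV level, control vs treated: {control['ev_level']} vs {treated['ev_level']}")
print(f"EV/cell ratio, control vs treated: {ratio_control} vs {ratio_treated}")
```

      In this constructed case the ratio halves although EV content is identical in both conditions, which is why showing the cellular and EV levels separately is more informative than the ratio alone.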

    5. Reviewer #3 (Public Review):

      The article by Ma et al pursues the previous work of the Schekman group, exploring the mechanisms of targeting of miRNAs into extracellular vesicles (EVs), or possibly exosomes, in HEK293 and U2OS cells. The authors had identified YBX1 as an RNA-binding protein required for the sorting of miR223 into CD63-expressing small EVs, probably mainly exosomes. Here they further observed that YBX1 directly binds miR223, which also binds to another protein, YBAP1, localized in mitochondria, where it sequesters miR223, thus preventing its targeting to MVBs' intraluminal vesicles. They observe the association of YBX1-containing P-bodies in the cytoplasm with mitochondria and with enlarged Rab5-endosomes and propose that this step is required for the exchange of miR223 for its loading into MVBs intraluminal vesicles and future exosomes.

      The biochemical parts of the article, with quantitative experiments to decipher the molecular interactions of YBX1 and YBAP1 with miR223, are nicely performed and convincing. By contrast, the parts on the involvement of YBX1 and of YBAP1 in the release of miR223 in EVs or exosomes are more correlative than demonstrative and lack some controls. In particular, it is far-fetched to conclude from the observed movement (which may be serendipitous) of 2 P-bodies between mitochondria and enlarged endosomes (without any visualization of the miR) that this movement may be instrumental in the transfer of miR223 between mitochondria and putative exosomes (figures 6 and model in figure 7).

      The experiments designed to evidence the mechanisms of miR223 release in EVs are also not sufficiently controlled and analysed to really support the interpretations. And the EV isolation steps are not performed in a way that supports the actual exosomal nature (i.e. exclusive origin from multivesicular endosome) of the EV analysed.

      Another experimental weakness is that the authors make strong conclusions on MVBs and exosomes when they only analyse artificially-enlarged endosomes induced by overexpression of mutant Rab5. Although this approach has been used previously and shown CD63 in these induced enlarged compartments, it is an artificial blocking of normal endosomal trafficking, and may not reflect the situation of intracellular trafficking of miR223 in normal cells.

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, Shin and colleagues investigate the role of the posttranslational modification of the DNA methyltransferase DNMT1 by covalent linkage of N-acetylglucosamine (O-GlcNAc).

      The authors present compelling evidence showing that a prolonged high fat/sucrose diet causes global protein O-GlcNAcylation in the liver and DNMT1 is among the proteins that increase their O-GlcNAc level. This result is significant because of the paucity of in vivo data addressing the interplay between metabolism and protein O-GlcNAcylation. The paper also shows that DNMT1's O-GlcNAcylation level correlated to the extracellular glucose levels in other cell types.

      Using mass spectrometry, the authors identify S878 as the main site for O-GlcNAcylation. It is noteworthy that the mapping was performed with hyper-O-GlcNAcylated cells and may be different in a physiological situation. To investigate how O-GlcNAcylation of S878 of DNMT1 impacts its activity and ultimately DNA methylation patterns, Shin and colleagues mostly use a cellular model of hyper O-GlcNAcylation induced by the combination of high glucose and a chemical inhibitor of OGA (the only enzyme responsible for O-GlcNAc removal). The data shows that increased O-GlcNAcylation resulting from the combination of high glucose and OGA inhibition causes a reduction of DNMT1 activity and local loss of DNA methylation specifically at partially methylated domains.

      This study brings completely new knowledge on the regulatory function of glycosylation of DNMT1 and its impact on its methyl-transferase activity and downstream genomic methylation. Furthermore, the manuscript introduces new data on the interplay between cellular metabolism and O-GlcNAcylation on DNMT1 and other proteins. The experiments are well-controlled, and their interpretation is sound. This study should be of special interest to the fields of fundamental and environmental epigenetics, as well as metabolism.

      The main limitation of the study is the convolution of the functional experiments where the perturbation is a combination of high glucose and chemical inhibition of OGA. The relative contribution of the two variables is partially addressed in Figure 3-figure supplement 1B which shows that high glucose increases DNMT1 activity (Hep3B cells) while Figure 3D shows that high glucose when combined with OGA inhibitor decreases DNMT1 activity (Hep3B cells). As discussed, the data suggest that high-glucose and OGA inhibition may have an antagonistic effect on DNMT1 activity. An experiment of treatment of the cells with the OGA inhibitor in physiological glucose conditions would address this gap of knowledge.

      We thank the reviewer for the suggestion. Physiological glucose levels are between 5 and 7 mM, and 25 mM is in the hyperglycemic range, which corresponds to severe diabetes. The new Figure 1A shows TMG treatment under physiological glucose conditions. We have included new WB data for 5 mM glucose, 5 mM glucose + TMG, 25 mM glucose, and 25 mM glucose + TMG (Figure 1A).

      To understand the impact of the environment (in this study: extracellular glucose level) on the epigenome, one should keep in mind the variation of cytosine methylation patterns between individuals and over time. A recent large-scale profiling of DNA methylation of 137 individuals shows a near absence of individual variation between replicates of the same cell type, suggesting that genomic methylation patterns are largely insensitive to the environment (https://doi.org/10.1038/s41586-022-05580-6).

      Comparative methylomes of healthy and diabetic individuals are needed to examine the medical significance of the findings presented here. It is possible that the modulation of DNMT1 activity by O-GlcNAc modification is relevant for a specific cell type or developmental stage that remains to be discovered.

      We thank the reviewer for the suggestion. While the present study is focused on the functional impact of glucose concentrations on O-GlcNAcylation of DNMT1, the extension of this work to diabetic individuals is a goal for a follow up project.

      Reviewer #2 (Public Review):

      I've read the manuscript by Shin et al with great interest. The authors describe the identification of O-GlcNAcylation of DNMT1 and the impact this modification has on the maintenance activity of DNMT1 genome-wide and that modification of S878 leads to enzyme inhibition. The manuscript is written in a clear and understandable way making it easy for the reader to understand the logic as well as the steps of the experimental approach.

      The authors identify O-GlcNAcylation of DNMT1 in a number of different cell lines by combining inhibition studies and WB and further on they identify the modification sites with LC/MS, predictions, and mutational studies. I really like the experimental approach, which while being straightforward (albeit technically challenging), is powerful and well-controlled in this case to unequivocally prove the modification of DNMT1 and identify the site. However, mutation of the two identified modification sites does not remove all the O-GlcNAcylation signal associated with DNMT1, thus possibly not all the possible sites were identified. While this is not a criticism of this manuscript, it would be interesting to know what other sites are modified and the enzymatic/biological effects associated.

      We completely agree with the reviewer. As the O-GlcNAc band was also detected in double mutated DNMT1 (Figure 2D), it is expected that undetected O-GlcNAcylated sites exist. This is a limitation of current MS analysis: modified sites located near either terminus of the protein (N- or C-terminal), or close to a site cut by an endoprotease such as trypsin, are known to be difficult to detect. In follow up work we plan to detect more diverse O-GlcNAc modified sites using more types of endoproteases and to observe the corresponding changes in the phenotype of various cells.

      Also, the authors isolate the modified DNMT1 from cells using immunoprecipitation, which is indeed useful to study the changes in catalytic activity but does not provide any information if the cellular localisation of modified DNMT1 changes.

      We apologize for this oversight. We have added a DNMT1 localization assay via immunofluorescence (IF) in the revised manuscript (Figure 3—figure supplement 3). We found no difference in DNMT1 localization between wild type and S878A mutants.

      Subsequently, the authors checked the impact of high glucose diet on the genome-wide DNA methylation patterns. The observed effects (Fig 4A) are very strong, almost as strong as observed with Aza treatment and therefore I wonder if LINE/IAP or other elements are getting activated (as observed with genome-wide demethylation with Aza).

      We thank the reviewer for the suggestion. Changes in methylation of LINE-1 under hyperglycemic conditions are displayed in Figure 4—figure supplement 4. In the case of LINE-1, DNA methylation is lost globally under hyperglycemic conditions. While beyond the scope of this study, a more thorough examination of the impact of the observed loss of methylation under high glucose conditions is of interest.

      Do the authors see any changes in cell phenotype, slower/faster proliferation, or increased apoptosis due to the activation of mobile elements (not only ROS)?

      This is also a very interesting idea. We plan on further investigating this as part of a follow up study.

      Another point is that the S878A mutant seems not to be able to fully maintain the DNA methylation (Fig 4A). Does O-GlcNAcylation recruit any additional interactors? Given that the authors immunoprecipitated DNMT1 and use it for activity assay, it is possible, that the modification attracts an additional protein factor that could in turn inhibit DNMT1 activity (as observed). Therefore, the observed kinetic effect could be indirect, while still interesting and important, the mechanism of inhibition would be different.

      We thank the reviewer for the great suggestions. According to Figure 4A, in the case of mutated DNMT1, a slight methylation loss appears to occur in both conditions. There could be a number of reasons for this. It may be due to interacting proteins or it may be caused by some damage to DNMT1 itself. A further investigation of this is planned as a follow up project.

      DNA methylation clock can be used to estimate the biological age of a tissue/cells. While not directly in the line of the manuscript, I was wondering if the DNA methylation changes in the high glucose diet would affect the methylation sites used for the DNAme clock. Meaning, would the cells/tissue epigenetically age faster when in high glucose media, and if the Ala mutant could provide resistance to that?

      We thank the reviewer for the interesting suggestion. We believe this is beyond the scope of this manuscript, but we'll consider this with interest in the future.

      In the discussion, the authors write that this is the first investigation of O-GlcNAcylation in relation to DNA methylation. While this is true for DNMTs, TET enzymes, which oxidise 5mC and trigger active DNA demethylation, have been shown before to also be modified.

      We have toned down the language throughout the revised manuscript. This is the first investigation into the maintenance of DNA methylation. Although there is a great deal of evidence regarding the important regulatory role of O-GlcNAcylation in gene regulation, a direct link with maintenance of DNA methylation has not previously been established.

      A nice and rigorous study, with important observations and connections to biological effects. It would be nice to prove that the effects are direct and not associated with other factors that could be recruited by the modification and impact the activity of DNMT1. I find it a bit surprising that phosphorylation of the target serine does not impact DNMT1 activity as well.

      We thank the reviewer for the positive comments and agree that there are many interesting avenues to follow up on this.

      Reviewer #3 (Public Review):

      The authors investigate the potential effect of OGlcNacylation on the activity of the DNA methyltransferase DNMT1.

      Some results that are convincingly obtained include:

      • There is more overall OGlcNacylation when Glucose concentration in the culture medium or the feed is high;

      • DNMT1 is OGlcNacylated, and more so in high glucose or on rich chow;

      • The position S878 can be OGlcNacylated;

      • The activity of transfected DNMT1 is decreased in high glucose conditions. This effect is lessened when S878 is mutated to A or D.

      Some results that are suggested but not fully backed by experimental data include:

      • This process happens to the endogenous protein under physiologically relevant conditions;

      We agree that we could not completely rule out endogenous DNMT1 in our experiments. We have adjusted the language in the revised manuscript to acknowledge this. However, we confirmed the change in activity of recombinant DNMT1 (Figure 3D), and also demonstrated the change in activity under physiological conditions (normal physiological glucose level vs hyperglycemic range) in Figure 3—figure supplement 1B. This result directly shows that the activity of DNMT1 changes under physiological conditions. In addition, DNA hypomethylation due to high glucose has already been reported (Kandilya et al., 2020; Lan et al., 2016). Our results suggest a possible mechanism for this.

      Kandilya, D., Shyamasundar, S., Singh, D.K., Banik, A., Hande, M.P., Stunkel, W., Chong, Y.S., and Dheen, S.T. (2020). High glucose alters the DNA methylation pattern of neurodevelopment associated genes in human neural progenitor cells in vitro. Sci Rep 10, 15676.

      Lan, C.C., Huang, S.M., Wu, C.S., Wu, C.H., and Chen, G.S. (2016). High-glucose environment increased thrombospondin-1 expression in keratinocytes via DNA hypomethylation. Transl Res 169, 91-101 e101-103.

      • This process is responsible for changes in DNA methylation, leading to changes in gene expression, leading to increased ROS and increased apoptosis.

      We confirmed that ROS levels increased under high glucose conditions through DCFH-DA fluorescence experiments (Figure 5A). In addition, γH2A.X fluorescence experiments showed that DNA damage was increased under high glucose conditions (Fig. 5B). On the other hand, in the case of the S878A mutant, DNA damage was reduced under hyperglycemic conditions compared to wild type DNMT1 despite an increase in ROS levels (Fig. 5B). Moreover, we verified that the DNA damage did not come from oxidative stress through 8-OHdG analysis (Figure 5—figure supplement 4). Therefore, DNA oxidative stress is suppressed by DNMT1 due to the increase of ROS under high glucose conditions. However, the reduction of DNA methylation by O-GlcNAcylation of DNMT1 induces apoptosis due to oxidative stress.

      Studying the connection between cellular metabolism and epigenetic phenomena is interesting. However, I feel that the article falls short of its aims because of the limits of the experimental system, some missing controls, and some data overinterpretation.

      We hope the reviewer finds our revised manuscript more suitable.

    2. eLife assessment

      This study explores the regulatory function of O-GlcNAcylation on DNA methyltransferase 1 and identifies serine 878 as the main target. This study is of interest to those in epigenetics and metabolism. The significance is important and the strength of the evidence is convincing.

    3. Reviewer #1 (Public Review):

      In this study, Shin and colleagues investigate the role of the posttranslational modification of the DNA methyltransferase by covalent linkage of the N-Acetylglucosamine (O-GlcNAc).

      The authors present compelling evidence showing that a prolonged high fat/sucrose diet causes global protein O-GlcNAcylation in the liver and DNMT1 is among the proteins that increase their O-GlcNAc level. This result is significant because of the paucity of in vivo data addressing the interplay between metabolism and protein O-GlcNAcylation. The paper also shows that DNMT1's O-GlcNAcylation level correlated to the extracellular glucose levels in other cell types.

      Using mass spectrometry, the authors identify S878 as the main site for O-GlcNAcylation. It is noteworthy that the mapping was performed with hyper-O-GlcNAcylated cells and may be different in a physiological situation. To investigate how O-GlcNAcylation of S878 of DNMT1 impacts its activity and ultimately DNA methylation patterns, Shin and colleagues mostly use a cellular model of hyper O-GlcNAcylation induced by the combination of high glucose and a chemical inhibitor of OGA (the only enzyme responsible for O-GlcNAc removal). The data shows that increased O-GlcNAcylation resulting from the combination of high glucose and OGA inhibition causes a reduction of DNMT1 activity and local loss of DNA methylation specifically at partially methylated domains.

      This study brings completely new knowledge on the regulatory function of glycosylation of DNMT1 and its impact on its methyl-transferase activity and downstream genomic methylation. Furthermore, the manuscript introduces new data on the interplay between cellular metabolism and O-GlcNAcylation on DNMT1 and other proteins. The experiments are well-controlled, and their interpretation is sound. This study should be of special interest to the fields of fundamental and environmental epigenetics, as well as metabolism.

      The main limitation of the study is the convolution of the functional experiments where the perturbation is a combination of high glucose and chemical inhibition of OGA. The relative contribution of the two variables is partially addressed in Figure 3-figure supplement 1B which shows that high glucose increases DNMT1 activity (Hep3B cells) while Figure 3D shows that high glucose when combined with OGA inhibitor decreases DNMT1 activity (Hep3B cells). As discussed, the data suggest that high-glucose and OGA inhibition may have an antagonistic effect on DNMT1 activity. An experiment of treatment of the cells with the OGA inhibitor in physiological glucose conditions would address this gap of knowledge.

      To understand the impact of the environment (in this study: extracellular glucose level) on the epigenome, one should keep in mind the variation of cytosine methylation patterns between individuals and over time. A recent large-scale profiling of DNA methylation of 137 individuals shows a near absence of individual variation between replicates of the same cell type, suggesting that genomic methylation patterns are largely insensitive to the environment (https://doi.org/10.1038/s41586-022-05580-6).

      Comparative methylomes of healthy and diabetic individuals are needed to examine the medical significance of the findings presented here. It is possible that the modulation of DNMT1 activity by O-GlcNAc modification is relevant for a specific cell type or developmental stage that remains to be discovered.

    4. Reviewer #2 (Public Review):

      I've read the manuscript by Shin et al with great interest. The authors describe the identification of O-GlcNAcylation of DNMT1 and the impact this modification has on the maintenance activity of DNMT1 genome-wide and that modification of S878 leads to enzyme inhibition.

      The manuscript is written in a clear and understandable way making it easy for the reader to understand the logic as well as the steps of the experimental approach.

      The authors identify O-GlcNAcylation of DNMT1 in a number of different cell lines by combining inhibition studies and WB and further on they identify the modification sites with LC/MS, predictions, and mutational studies. I really like the experimental approach, which while being straightforward (albeit technically challenging), is powerful and well-controlled in this case to unequivocally prove the modification of DNMT1 and identify the site. However, mutation of the two identified modification sites does not remove all the O-GlcNAcylation signal associated with DNMT1, thus possibly not all the possible sites were identified. While this is not a criticism of this manuscript, it would be interesting to know what other sites are modified and the enzymatic/biological effects associated.

      Also, the authors isolate the modified DNMT1 from cells using immunoprecipitation, which is indeed useful to study the changes in catalytic activity but does not provide any information if the cellular localisation of modified DNMT1 changes. Subsequently, the authors checked the impact of high glucose diet on the genome-wide DNA methylation patterns. The observed effects (Fig 4A) are very strong, almost as strong as observed with Aza treatment and therefore I wonder if LINE/IAP or other elements are getting activated (as observed with genome-wide demethylation with Aza). Do the authors see any changes in cell phenotype, slower/faster proliferation, or increased apoptosis due to the activation of mobile elements (not only ROS)? Another point is that the S878A mutant seems not to be able to fully maintain the DNA methylation (Fig 4A). Does O-GlcNAcylation recruit any additional interactors? Given that the authors immunoprecipitated DNMT1 and use it for activity assay, it is possible, that the modification attracts an additional protein factor that could in turn inhibit DNMT1 activity (as observed). Therefore, the observed kinetic effect could be indirect, while still interesting and important, the mechanism of inhibition would be different.

      DNA methylation clock can be used to estimate the biological age of a tissue/cells. While not directly in the line of the manuscript, I was wondering if the DNA methylation changes in the high glucose diet would affect the methylation sites used for the DNAme clock. Meaning, would the cells/tissue epigenetically age faster when in high glucose media, and if the Ala mutant could provide resistance to that?

      In the discussion, the authors write that this is the first investigation of O-GlcNAcylation in relation to DNA methylation. While this is true for DNMTs, TET enzymes, which oxidise 5mC and trigger active DNA demethylation, have previously been shown to be modified as well.

      A nice and rigorous study, with important observations and connections to biological effects. It would be nice to prove that the effects are direct and not associated with other factors that could be recruited by the modification and impact the activity of DNMT1. I find it a bit surprising that phosphorylation of the target serine does not impact DNMT1 activity as well.

    5. Reviewer #3 (Public Review):

      The authors investigate the potential effect of OGlcNacylation on the activity of the DNA methyltransferase DNMT1.

      Some results that are convincingly obtained include:

      • There is more overall OGlcNacylation when Glucose concentration in the culture medium or the feed is high;

      • DNMT1 is OGlcNacylated, and more so in high glucose or on rich chow;

      • The position S878 can be OGlcNacylated;

      • The activity of transfected DNMT1 is decreased in high glucose conditions. This effect is lessened when S878 is mutated to A or D.

      Some results that are suggested but not fully backed by experimental data include:

      • This process happens to the endogenous protein under physiologically relevant conditions;

      • This process is responsible for changes in DNA methylation, leading to changes in gene expression, leading to increased ROS and increased apoptosis.

      Studying the connection between cellular metabolism and epigenetic phenomena is interesting. However, I feel that the article falls short of its aims because of the limits of the experimental system, some missing controls, and some data overinterpretation.

    1. Author Response

      Reviewer #1 (Public Review):

      Overall, this manuscript exposes key gaps in patient care resulting from the pandemic, as well as the challenges and unmet needs felt by healthcare workers in cervical cancer screening. The authors’ findings on the struggles while regaining screening volume across the nation in a sustainable way, demonstrate that pre-existing weaknesses in the cancer control system were exacerbated by the pandemic and are integral to amend. The authors were able to identify these gaps in care and work environments through their synthesis of qualitative interviews. I applaud the use of such mixed methods, which emphasizes the complementary need for both quantitative and qualitative data. What could be better strengthened in the manuscript is the authors’ justification for statistical analyses within the context of the research question, and reporting of survey administration and management.

      The authors thank the reviewer for a thorough assessment of the manuscript. We have addressed the reviewer’s concerns regarding justification of statistical analyses in the Data Analysis, Quantitative survey data section, and reporting of survey administration and management in the Results, Quantitative survey data section.

      Reviewer #2 (Public Review):

      Fuzzell et al. conducted a mixed-method study looking into the possible impact of COVID-19 on clinician perceptions of cervical cancer screening. The authors examined how the pandemic-related staffing changes might have affected the screening and abnormal results follow-up during the period October 2021 through July 2022.

      They found that 80% of the clinicians experienced decreased screening during the start of the pandemic and that ≈67% reported a return to pre-pandemic levels. The general barriers for not returning to pre-pandemic levels were staffing shortages and problems with structural systems for tracking overdue patients and those in need of follow-up after abnormal screening tests.

      Strengths:

      There is a high focus on the consequences and the need for action to prevent the ongoing impact of COVID-19 on cervical cancer screening. Some of the actions mentioned by the authors could be the use of HPV self-sampling kits, and it is interesting to be provided knowledge on the clinicians' views on HPV self-sampling. Both are of high interest to the general population in the US. Throughout the discussion, the authors and their claims are supported by other studies.

      Weaknesses:

      The lack of a nationally representative sample, where 63% of the responding clinicians were practicing in the Northeast, limits how far the results of the study can be generalized. The overrepresentation of white females is not addressed in the discussion. This composition could have affected the results, especially when the authors report a need to look at higher salaries and better childcare to maintain adequate staffing.

      The conclusions are mostly supported by the data, however, some aspects of the data analysis need to be clarified.

      We thank the reviewer for their constructive feedback. Despite our best efforts, we were unable to recruit a sample more representative of all US regions. We note this limitation in the discussion: “Notwithstanding efforts to achieve a regionally diverse sample, 63% of responding clinicians were practicing in the Northeast at the time of their participation. Given that COVID-19 policies varied widely by state, this regional imbalance may limit the generalizability of our results. Despite the oversample of clinicians in the Northeast, region was not a significant predictor of either outcome.” Also, we acknowledge the high enrollment of White women in our provider sample and now address this point in the discussion: “Similarly, our sample was 85% female and 70% White. Although ideally we would have included a sample that was more diverse with respect to race and gender, these characteristics are not disparate from the majority of clinicians who perform cervical cancer screening (e.g., race: Women’s Health NPs [77% White], active Ob/Gyns [67% White], all active physicians [64% White]; gender: all NPs [92% female], Ob/Gyns [64% female], all active physicians [37% female]).” Data describing these characteristics are reported in the Association of American Medical Colleges (AAMC) 2022 Physician Specialty Data Report and Executive Summary, the 2018 NPWH Women’s Health Nurse Practitioner Workforce Demographics and Compensation Survey: Highlights Report, and a published paper describing the characteristics of nurse practitioners in the US, which are cited in text.

      Reviewer #3 (Public Review):

      This US study presents findings from an online survey and in-person interviews of healthcare providers regarding themes associated with cervical screening in federally qualified health centres (FQHCs). The study provides insights during the post-acute phase of the pandemic into a range of areas, including perceived changes in the provision of cervical cancer screening services and the impact of the pandemic, staffing and systems barriers to cervical cancer screening, strategies for tracking missed screens and catch-ups, follow-up of abnormal screening results, as well as attitudes towards HPV self-sampling. Results indicate persisting pandemic-related impacts on patient engagement and staffing, as well as system barriers to effective screening, catch-up of missed screens and follow-ups. Taken together, these issues may lead to increases in cervical cancer in the long-term in populations serviced by these centres, if measures are not taken to adequately support them. Participants were recruited from various regions in the US, however, the study was not conducted using a nationally-representative sample. Although highlighted issues are informative, findings cannot be generalised and larger studies are warranted in the future to monitor cervical screening provision and outcomes in FQHCs.

      We thank the reviewer for their thorough assessment of the manuscript. In the discussion, we have made sure to note the non-nationally representative sample and need for continued monitoring of cervical cancer screening and related outcomes in underserved settings and communities.

    2. eLife assessment

      This US study presents findings from an online survey and in-person interviews of healthcare providers in areas associated with cervical screening provision during the post-acute phase of the pandemic. The findings are valuable as they provide insights into a range of areas, from healthcare characteristics to screening barriers and HPV self-sampling. The evidence supporting the claims of the authors is solid, although the inclusion of a nationally-representative sample of healthcare providers and a greater gender/ethnicity/racial mix of interviewees would have strengthened the study. The work will be of interest to public health scientists and a cancer prevention and control audience.

    3. Reviewer #1 (Public Review):

      Overall, this manuscript exposes key gaps in patient care resulting from the pandemic, as well as the challenges and unmet needs felt by healthcare workers in cervical cancer screening. The authors' findings on the struggles while regaining screening volume across the nation in a sustainable way, demonstrate that pre-existing weaknesses in the cancer control system were exacerbated by the pandemic and are integral to amend. The authors were able to identify these gaps in care and work environments through their synthesis of qualitative interviews. I applaud the use of such mixed methods, which emphasizes the complementary need for both quantitative and qualitative data. What could be better strengthened in the manuscript is the authors' justification for statistical analyses within the context of the research question, and reporting of survey administration and management.

    4. Reviewer #2 (Public Review):

      Fuzzell et al. conducted a mixed-method study looking into the possible impact of COVID-19 on clinician perceptions of cervical cancer screening. The authors examined how the pandemic-related staffing changes might have affected the screening and abnormal results follow-up during the period October 2021 through July 2022.

      They found that 80% of the clinicians experienced decreased screening during the start of the pandemic and that ≈67% reported a return to pre-pandemic levels. The general barriers for not returning to pre-pandemic levels were staffing shortages and problems with structural systems for tracking overdue patients and those in need of follow-up after abnormal screening tests.

      Strengths:

      There is a high focus on the consequences and the need for action to prevent the ongoing impact of COVID-19 on cervical cancer screening. Some of the actions mentioned by the authors could be the use of HPV self-sampling kits, and it is interesting to be provided knowledge on the clinicians' views on HPV self-sampling. Both are of high interest to the general population in the US. Throughout the discussion, the authors and their claims are supported by other studies.

      Weaknesses:

      The lack of a nationally representative sample, where 63% of the responding clinicians were practicing in the Northeast, limits how far the results of the study can be generalized. The overrepresentation of white females is not addressed in the discussion. This composition could have affected the results, especially when the authors report a need to look at higher salaries and better childcare to maintain adequate staffing.

      The conclusions are mostly supported by the data, however, some aspects of the data analysis need to be clarified.

    5. Reviewer #3 (Public Review):

      This US study presents findings from an online survey and in-person interviews of healthcare providers regarding themes associated with cervical screening in federally qualified health centres (FQHCs). The study provides insights during the post-acute phase of the pandemic into a range of areas, including perceived changes in the provision of cervical cancer screening services and the impact of the pandemic, staffing and systems barriers to cervical cancer screening, strategies for tracking missed screens and catch-ups, follow-up of abnormal screening results, as well as attitudes towards HPV self-sampling. Results indicate persisting pandemic-related impacts on patient engagement and staffing, as well as system barriers to effective screening, catch-up of missed screens and follow-ups. Taken together, these issues may lead to increases in cervical cancer in the long-term in populations serviced by these centres, if measures are not taken to adequately support them. Participants were recruited from various regions in the US, however, the study was not conducted using a nationally-representative sample. Although highlighted issues are informative, findings cannot be generalised and larger studies are warranted in the future to monitor cervical screening provision and outcomes in FQHCs.

    1. Author Response

      Reviewer #2 (Public review):

      1) The systematic review includes data from some studies where PCOS is self-reported. While self-reported PCOS information has been found to be largely sensitive and specific, it would be of interest to know if prevalence ratios of mental health-related disorders were impacted by self-reporting.

      Thank you for your insightful comment regarding the potential impact of self-reporting on the prevalence ratios of mental health-related outcomes in women with PCOS. We agree that this is an important factor to consider.

      In response, we have revisited all the studies included in our review. We have updated Supplemental Tables 2-4 to provide greater transparency and understanding. These revised tables now include a new column specifying the mental health assessment method used in each study. This update should allow for a more nuanced interpretation of the results, taking into account the potential impact of self-reporting.

      Furthermore, we conducted a sensitivity analysis by rerunning the meta-analysis to discern the potential influence of self-reported PCOS on our results, excluding the studies that relied solely on self-reported PCOS diagnosis. After we excluded studies where PCOS was self-reported, the point estimate for anxiety was similar whereas point estimates for depression and eating disorder were slightly higher but none of the estimates were different beyond chance compared to the original analysis. We believe these steps significantly strengthen the clarity and robustness of our findings (Line 314; Supplemental Tables 7 and 8).

      2) Likewise, the screening vs self-reported nature of the mental health disorders is not clear from the information included in the characteristics table.

      We have modified our Supplemental Tables 2-5 to include a column detailing the method of ‘Mental Health Assessment’. We should note that the majority of the studies directly assessed mental health using a variety of validated questionnaires. We have also included in the Discussion a section emphasizing that some of the studies included in the review relied on self-reported PCOS diagnosis and its potential impact. We also highlighted that while self-reported information is generally reliable, it is subject to potential bias that could impact the prevalence ratios of mental health-related conditions (Line 460).

      3) Calculated prevalence ratios were compared with prevalence values for the general population to determine the excess prevalence. However, the source of these general population statistics (i.e., whether these figures come from the control data in the included studies or other sources) is not clear.

      Thank you for raising this important point. We have now clarified in our Methods section that the general population statistics used for determining excess prevalence were derived from the control data in the included studies. We hope this provides the necessary transparency for our approach in calculating and interpreting the prevalence ratios (Line 210).

      4) The estimated costs for anxiety-, depression- and eating disorder-related care are accessed in published papers and used to calculate the excess costs. Conclusions would be strengthened by a defence of these figures, particularly for anxiety where the source paper is from 1999.

      Thank you for your insightful comment. We agree that providing a justification for our choice of cost estimates, especially for the anxiety care cost from a 1999 study, would strengthen our conclusions. The 1999 source was selected because it is a seminal study that offers a comprehensive breakdown of anxiety-related care costs. Despite its age, this paper is often cited in contemporary research due to its rigorous methodology and the granularity of its cost analysis. Adjusted for inflation, its findings still provide an insightful comparison point for current data. To ensure that these figures accurately represent present-day costs, we have adjusted them for inflation using the medical care inflation calculator. Our choice of these specific studies was based on their rigorous methodology, the detailed breakdown of costs, and their relevance to our targeted age groups. The aforementioned adjustments and justifications ensure that these figures aptly represent the present-day costs of treating these conditions.

      Similarly, the 2021 papers on depression and eating disorders present comprehensive and up-to-date analyses of the economic burdens associated with these conditions. These papers were selected for their rigorous methodologies, comprehensive cost breakdowns, and alignment with our age-specific focus. The Greenberg et al. (2021) paper, for example, is an authoritative source that provides detailed analysis on the economic burden of adults with major depressive disorder. Likewise, the paper by Streatfeild et al. (2021) offers a meticulous investigation into the socio-economic cost of eating disorders in the U.S., making it an apt choice for our study. We recognize the necessity of providing a robust justification for our choice of these particular papers, and we have endeavored to do so in our Methods section, thus reinforcing the transparency of our approach. We have clarified this in our Methods section to make our approach more transparent to readers (Line 225).

      5) An inflation tool is used to adjust the figure, but this does not take into account changes in treatment or practice since this estimate was made. The accuracy of these estimated figures is central to the final conclusions.

      Thank you for your valuable comment. We do note that the inflation figures used are a healthcare-specific inflation factor, as healthcare inflation differs from general consumer inflation. However, we agree that the inflation-adjusted figures do not necessarily account for changes in treatment practices since the original estimate was made, assuming these changes would alter the cost of care. We have added a discussion of this limitation in our manuscript and proposed future studies to validate these estimates using more recent data (Line 473).
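      The arithmetic behind an inflation-adjusted excess cost can be illustrated with a short Python sketch. This is not the authors' actual calculation: the function names and the formula for excess cases (baseline prevalence multiplied by the prevalence ratio minus one) are assumptions made for illustration, and every number in the usage example is a hypothetical placeholder rather than a value from the study.

```python
def inflate(cost, index_base, index_target):
    """Scale a historical cost by the ratio of two price-index values.

    index_base and index_target would come from a healthcare-specific
    price index; no real index values are assumed here.
    """
    if index_base <= 0:
        raise ValueError("index_base must be positive")
    return cost * index_target / index_base


def excess_cases(population, baseline_prevalence, prevalence_ratio):
    """Cases beyond those expected at the baseline (control) prevalence.

    Assumed formula: population * baseline * (PR - 1).
    """
    return population * baseline_prevalence * (prevalence_ratio - 1.0)


def excess_cost(population, baseline_prevalence, prevalence_ratio,
                cost_per_case, index_base, index_target):
    """Excess cases multiplied by an inflation-adjusted per-case cost."""
    adjusted = inflate(cost_per_case, index_base, index_target)
    return excess_cases(population, baseline_prevalence, prevalence_ratio) * adjusted


if __name__ == "__main__":
    # Hypothetical placeholder inputs, not data from the paper.
    total = excess_cost(
        population=1_000_000,
        baseline_prevalence=0.10,
        prevalence_ratio=1.5,
        cost_per_case=1_000.0,
        index_base=250.0,
        index_target=500.0,
    )
    print(f"Illustrative excess cost: {total:,.0f}")
```

      As the response notes, a pure price-index adjustment of this kind cannot capture changes in treatment practice since the original estimate, which is why the per-case cost remains the most uncertain input.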

    2. eLife assessment

      This important paper describes a valuable systematic review and meta-analysis of mental health problems in polycystic ovary syndrome (PCOS) that drive the excess economic burden associated with this common endocrine disorder. Interestingly, the cost of the diagnostic evaluation is only a relatively minor part of the total costs, but mental health disorders were identified as a significant component of the economic burden. These solid findings could not have been anticipated intuitively and are of considerable value for public health prioritization of PCOS.

    3. Reviewer #1 (Public Review):

      The aim of this study was to evaluate the increased prevalence of mental health (MH) disorders such as depression, anxiety, eating disorders, and postpartum depression in patients with polycystic ovary syndrome (PCOS), the most common reproductive disorder, which affects about one in seven reproductive-aged women worldwide. The resulting excess economic burden was then estimated.

      Meta-analyses were performed using the DerSimonian-Laird random-effects model to compute pooled estimates of prevalence ratios for the associations between PCOS and these MH disorders, and then the excess direct costs in U.S. dollars (USD) for women suffering from PCOS were estimated.
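      For readers unfamiliar with the pooling step, the DerSimonian-Laird estimator can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes each study contributes a log prevalence ratio and its sampling variance, and the function names and the 1.96 normal-approximation interval are choices made here for illustration.

```python
import math


def dersimonian_laird(log_effects, variances):
    """Pool study-level log effects with the DerSimonian-Laird estimator.

    Returns (pooled log effect, its standard error, tau^2).
    """
    k = len(log_effects)
    if k == 0 or k != len(variances):
        raise ValueError("need equal-length, non-empty inputs")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


def pooled_prevalence_ratio(log_prs, variances, z=1.96):
    """Back-transform the pooled log PR and its approximate 95% interval."""
    pooled, se, tau2 = dersimonian_laird(log_prs, variances)
    return (math.exp(pooled),
            math.exp(pooled - z * se),
            math.exp(pooled + z * se),
            tau2)
```

      When the study estimates agree closely, the between-study variance is truncated at zero and the result coincides with the fixed-effect inverse-variance estimate; larger heterogeneity spreads the weights more evenly across studies and widens the interval.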

      After screening the articles by title/abstract, 25 articles were selected for their quality according to the Newcastle-Ottawa scale. These studies included a control group. The data showed an increase in the prevalence ratios for each of the selected mental health disorder items: anxiety 1.42, depression 1.65, and eating disorders 1.48. The additional direct health care costs associated with these disorders were estimated to be $4.261 billion per year in 2021 USD.

      The authors extended their previous report that the total cost of evaluating and providing care to reproductive-aged PCOS women in the United States was $4.36 billion. Interestingly, the cost for diagnostic evaluation including laboratory accounted for a relatively minor part of the total costs (approximately 2%). In the present study, mental health disorders were clearly identified as a part of the excess economic burden. Their cost is estimated at $4.261 billion/year. These results were not anticipated intuitively and are of value for prioritization of the disorder as a public health priority.

      Provided that the study is validated for extraction of a meta-analysis, the data are of great interest not only for economic issues but also for early consideration of the mental distress of PCOS patients that has long been underestimated. Several studies have expressed patient resentment of delayed diagnosis and imperfect management, including the physical damage of hyperandrogenism and the associated metabolic syndrome. This medico-economic approach to chronic diseases with a strong impact on quality of life contributes to the global management of PCOS, which is a primary demand of patients.

    4. Reviewer #2 (Public Review):

      Yadav et al have performed a careful systematic review and meta-analysis of mental health disorder prevalence ratios in PCOS to estimate the mental health-related excess economic burden associated with this common endocrine disorder. Using random effect modelling of prevalence ratios from quality-assessed, peer-reviewed publications, they determine the excess PCOS-related prevalence and healthcare costs associated with anxiety, depression, and eating disorders to be greater than $4 billion USD per year. In conjunction with previously reported direct economic burden estimates for PCOS, they determine that PCOS healthcare costs exceed $15 billion USD per year (in the US alone) and that mental health disorder-related costs account for nearly one-third of these costs. The findings of this paper will be impactful for a broad field of clinical and bench scientists investigating PCOS, endocrinologists, general practitioners, health economists, and policymakers. The findings of this paper demonstrate the significant contribution that mental health-related pathology makes to the total economic burden associated with PCOS and present a strong case for additional research and policy investment into this underfunded area.

      The important findings and claims presented in this paper are mostly clearly presented and well supported by strong evidence and careful analysis. However, some additional clarity and rationalisation of referenced healthcare cost input to the model would strengthen the conclusions.

      Strengths:

      This paper clearly describes the inclusion criteria and characteristics of the included studies. The papers included were quality assessed using a well-regarded assessment tool and only those with high-quality information were included in subsequent meta-analyses. Publication bias was assessed by multiple methods and data were interpreted accordingly.

      The authors combine their mental health-related findings with previously reported economic burden estimates for specific PCOS-related care and treatment to provide a comprehensive estimation of PCOS-related healthcare costs in the US. They discuss these findings in relation to healthcare-related costs reported for other prevalent disorders and make a compelling case for prioritising research and investment into PCOS.

      An important observation made by the authors is the relatively small contribution to PCOS economic burden made by diagnostic evaluation, supporting quality diagnosis and evaluation as a cost-effective measure to improve PCOS patient treatment.

      Weaknesses:

      The systematic review includes data from some studies where PCOS is self-reported. While self-reported PCOS information has been found to be largely sensitive and specific, it would be of interest to know if the prevalence ratios of mental health-related disorders were impacted by self-reporting. Likewise, the screening vs self-reported nature of the mental health disorders is not clear from the information included in the characteristics table.

      Calculated prevalence ratios were compared with prevalence values for the general population to determine the excess prevalence. However, the source of these general population statistics (i.e., whether these figures come from the control data in the included studies or other sources) is not clear. The estimated costs for anxiety-, depression- and eating disorder-related care are taken from published papers and used to calculate the excess costs. Conclusions would be strengthened by a defence of these figures, particularly for anxiety, where the source paper is from 1999. An inflation tool is used to adjust the figure, but this does not take into account changes in treatment or practice since this estimate was made. The accuracy of these estimated figures is central to the final conclusions.
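
      To make the dependence on these inputs concrete, here is a minimal sketch of one plausible reading of the calculation: excess prevalence as baseline prevalence times (prevalence ratio minus one), multiplied by a per-case cost that has been rescaled by a price index. The function names, patient count, baseline prevalence, per-case cost, and index values are all hypothetical; only the anxiety prevalence ratio of 1.42 comes from the summary above, and the paper's actual procedure may differ.

```python
def excess_prevalence(baseline_prevalence, prevalence_ratio):
    """Extra prevalence in the exposed group beyond the general-population baseline."""
    return baseline_prevalence * (prevalence_ratio - 1.0)


def inflate(cost, index_then, index_now):
    """Rescale a historical cost by a price index; does not capture changes in practice."""
    return cost * index_now / index_then


def excess_annual_cost(n_patients, baseline_prevalence, prevalence_ratio, cost_per_case):
    """Excess cases among n_patients multiplied by the annual cost per case."""
    extra_cases = n_patients * excess_prevalence(baseline_prevalence, prevalence_ratio)
    return extra_cases * cost_per_case


# Hypothetical inputs, except the prevalence ratio quoted in the review
n_pcos = 1_000_000          # hypothetical number of affected women
baseline_anxiety = 0.10     # hypothetical general-population prevalence
pr_anxiety = 1.42           # pooled prevalence ratio reported for anxiety
cost_then = 1000.0          # hypothetical per-case annual cost in the source year
cost_now = inflate(cost_then, index_then=100.0, index_now=150.0)

total = excess_annual_cost(n_pcos, baseline_anxiety, pr_anxiety, cost_now)
print(f"excess annual cost: {total:,.0f} USD")
```

      Because the total is a simple product, any error in the baseline prevalence or in the historical per-case cost propagates proportionally into the final figure, which is why the provenance of those inputs matters.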

    1. Author Response

      Reviewer #1 (Public Review):

      GSK3 is a multi-tasking kinase that recognises primed (i.e. phosphorylated) substrates. One of the mechanisms by which the activity of GSK3 can be regulated is through N-terminal (pSer9) phosphorylation. In this case, the phosphorylated N-terminus turns into a pseudo-substrate that occupies the substrate binding pocket and thus inhibits the activity of GSK3 towards its real substrates.

      One outstanding question is how this autoinhibitory mechanism can affect some, but not all signaling pathways that GSK3 is involved in. One example is WNT/CTNNB1 signaling. Here, GSK3 plays a central role in the turnover of CTNNB1 in the absence of WNT, but this pool of GSK3 is not affected by pSer9 phosphorylation.

      Gavagan et al. address this question using an in vitro approach with purified proteins. They identify a role for AXIN1 in protecting the "WNT signaling pool" of GSK3 from the auto-inhibition that occurs upon pSer9 phosphorylation.

      Specifically, they show that i) GSK3-pSer9 is less capable of binding and phosphorylating primed CTNNB1 - thus suggesting that GSK3-pSer9 does not contribute to WNT signaling, ii) in the presence of AXIN1, GSK3-pSer9 becomes more capable of binding and phosphorylating CTNNB1 - suggesting that Axin can promote binding of GSK3 and CTNNB1 even when the primed binding pocket on GSK3 is blocked initially, iii) AXIN1 specifically prevents the PKA mediated phosphorylation of GSK3B on pSer9 - while leaving the phosphorylation of other PKA substrates unaffected.

      Strengths:

      • The authors use an in vitro system in which they can reconstitute different interactions and reactions using purified proteins, thus allowing them to zoom in on specific biochemical events in isolation.

      • The authors measure the phosphorylation of primed substrates (pSer45-CTNNB1 or WNT-independent substrates) and quantify specific kinetic parameters (kcat, KM, and kcat/KM) - of wildtype non-phosphorylated GSK3B, pSer9GSK3B, or the non-phosphorylatable S9A-GSK3B, either in the presence or absence of AXIN1 (or an AXIN1 fragment).

      • The experiments appear to be well-controlled and the results appear to be interpreted correctly.

      Weaknesses:

      • Key experiments (e.g. Figures 2 and 3) are described as being performed as n=3 technical replicates rather than independent/biological replicates.

      We suggest that the replicates described in our work can properly be described as biological replicates, and we have updated the manuscript accordingly. We apologize for the confusion and elaborate on our reasoning below.

      Each replicate reported for our in vitro kinetic assays is an independent reaction prepared in a separate reaction vessel, and replicates were analyzed on separate gels. Thus, each reaction is a distinct biological sample and should have been described as a biological replicate. A technical replicate would have been repeat measurements of the same timepoint from a single reaction.

      Our original description as technical replicates was based on the notion that each replicate came from the same protein purification (biological sample). However, an analogy to cell culture experiments can illustrate why our initial reasoning was incorrect. In a cell culture experiment, cells from the same initial source are typically split into independent wells for biological replicates. Similarly, our proteins come from the same initial source but are split into independent reaction vessels for biological replicates.

      The critical point is that, regardless of the precise terminology, our replicates capture the variability between independent experiments.

      • The validation in a biologically relevant setting (i.e. a cellular context) is limited to Figure 4C, which shows that over-expression of AXIN1 reduces the total levels of pSer9-GSK3.

      The biochemical experiments presented in our work address a critical gap in the signaling field and, together with the in vivo validation in Figure 4C, establish a model that was previously speculative. We suggest that further in vivo experiments are beyond the scope of the current manuscript.

      The authors convincingly show that AXIN1 can play a role in shielding GSK3 from auto-inhibition. As it stands, the impact of this work on the field of WNT/CTNNB1 signaling is likely to remain limited. This is mainly because the mechanism by which AXIN1 shields the WNT/CTNNB1 signaling pool of GSK3 from pSer9 inhibition remains unresolved. Based on the fact that a mini AXIN1 (i.e. an AXIN1 fragment) behaves the same as WT AXIN1, the authors conclude that AXIN1 likely causes allosteric changes on GSK3 but is less likely to block PKA from binding. They cannot conclusively show this, however, as they do not have evidence in favour of one or the other explanation.

      We thank the reviewer for this important comment, which details the central concern raised in the review process. To address this point, we have collected additional biochemical data that conclusively show that the Axin shielding effect is allosteric and not a steric block. We demonstrated that a minimal, 27 amino acid Axin peptide produces the same GSK3β shielding behavior as full length Axin and miniAxin. The minimal Axin peptide does not sterically occlude the GSK3β phosphorylation site. These data are included in a revised Fig 4A and described on lines 115-120 of the revised manuscript.

      However, this study does offer more insight into the compartmentalisation of GSK3 and the quantitative parameters may be used in computational models describing the different cellular activities of GSK3.

      This work also has conceptual significance: Scaffold proteins are known to promote signal transduction by bringing proteins together (often: kinases and substrates). Here, Gavagan et al. show that AXIN1 also plays a second role, namely in protecting one of its binding kinases (GSK3) from inhibitory signals. This could potentially hold for other scaffolding proteins as well.

      Reviewer #2 (Public Review):

      Gavagan et al. investigated the role of the scaffolding protein, Axin, in the cross-pathway inhibition of GSK3b. The authors utilize reconstituted Axin, b-catenin, GSK3b, and protein kinase A to test 2 models. In the first model, the formation of the complex consisting of Axin, b-catenin, and GSK3b overcomes inhibitory phosphorylation of serine 9 of GSK3b. In the second model, the binding of Axin to GSK3b inhibits serine 9 phosphorylation through allosteric effects. Previous literature has established that the phosphorylation of serine 9 of GSK3b inhibits its kinase activity. To provide a quantitative measure of inhibition, the authors determine the binding affinity and catalytic efficiency of GSK3b in comparison to GSK3b phosphoS9 towards b-catenin. Interestingly, the data demonstrate a 200-fold decrease in kcat/Km and a 7-fold increase in Km. It is unclear why serine 9 mutation to alanine increases the rate of b-catenin phosphorylation more than the unphosphorylated GSK3b protein in figure S10.

      We thank the reviewer for catching this inconsistency. In the Michaelis-Menten plots presented in the main text (Figure 2 & Figure 3D), rates for unphosphorylated GSK3β and GSK3β_S9A are indistinguishable. These plots were used to determine the kinetic parameters reported in Table S1 (now Supplementary file 1a). The purpose of Figure S10 (now Figure 2-figure supplement 8) was to confirm that these reactions were first order (linear) in enzyme concentration, but the reviewer is correct to flag the inconsistency in absolute rates. In Figure S10A (now Figure 2-figure supplement 8A), the rates for unphosphorylated GSK3β were ~2-3-fold lower than expected.

      We have reanalyzed the original frozen reaction timepoints on new western blots. The results were identical for unphosphorylated GSK3β and GSK3β_S9A, resolving the apparent discrepancy. Upon review of the original western blot images, we noted that they were relatively noisy, potentially indicating incomplete blot transfer or a degraded antibody. Because we were able to reanalyze the original samples and obtained internally consistent results, we suggest that the updated data should replace the original data. The updated data are included in a revised Figure S10A (now Figure 2-figure supplement 8A).
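
      The check that a reaction is first order (linear) in enzyme concentration, mentioned in this response, can be sketched as a least-squares fit of rate against enzyme concentration constrained through the origin. The helper name and the data points below are hypothetical and do not come from the manuscript; this is an illustration of the idea, not the authors' analysis.

```python
def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x (no intercept).

    Returns the slope and an uncentered R^2, which is the appropriate
    goodness-of-fit measure for a model without an intercept.
    """
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    ss_res = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum(yi * yi for yi in y)
    return slope, 1.0 - ss_res / ss_tot


# Hypothetical initial rates measured at increasing enzyme concentrations
enzyme_nM = [5.0, 10.0, 20.0, 40.0]
rate_nM_per_min = [0.9, 2.1, 3.9, 8.2]

slope, r2 = fit_through_origin(enzyme_nM, rate_nM_per_min)
print(f"rate per unit enzyme = {slope:.3f} per min, R^2 = {r2:.4f}")
```

      If the reaction is first order in enzyme, doubling the enzyme concentration doubles the rate, so the points fall on a line through the origin and the slope is the same for any two enzyme preparations with equal specific activity.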

      Next, the authors tested if the addition of Axin could overcome this inhibition. Although the addition of Axin decreases the Km, thereby producing a 20-fold increase in catalytic efficiency, the addition of Axin does not rescue the catalytic turnover of the phosphorylated GSK3b. Hence, the authors propose that Axin does not rescue the kinase activity of GSK3b from the inhibitory effects of serine 9 phosphorylation.

      Next, the authors test if Axin protects GSK3b from phosphorylation by the upstream kinase PKA. Excitingly, the data show a decrease in binding affinity and catalytic efficiency of PKA with GSK3b phosphoS9 in comparison to GSK3b. The binding of Axin inhibits GSK3b serine 9 phosphorylation by PKA but does not inhibit the phosphorylation of other PKA substrates such as Creb. The authors demonstrate that a fragment of Axin, residues 384-518, behaves similarly to the full-length Axin to shield GSK3b from phosphorylation. However, it is unclear how this fragment may bind in the destruction complex and if Axin has allosteric effects on GSK3b.

    2. eLife assessment

      This study presents a valuable and elegant kinetic analysis of GSK3beta activity as a function of phosphorylation and Axin binding, providing insights into critical steps of Wnt pathway signaling. The results will be of great use to the broader signaling community; however, the incomplete dissection of the mechanism by which Axin binding inhibits GSK3beta inhibitory phosphorylation remains a weakness of this study. The work will be of broad interest to cell biologists and biochemists.

    3. Reviewer #1 (Public Review):

      GSK3 is a multi-tasking kinase that recognises primed (i.e. phosphorylated) substrates. One of the mechanisms by which the activity of GSK3 can be regulated is through N-terminal (pSer9) phosphorylation. In this case, the phosphorylated N-terminus turns into a pseudo-substrate that occupies the substrate binding pocket and thus inhibits the activity of GSK3 towards its real substrates.

      One outstanding question is how this autoinhibitory mechanism can affect some, but not all signaling pathways that GSK3 is involved in. One example is WNT/CTNNB1 signaling. Here, GSK3 plays a central role in the turnover of CTNNB1 in the absence of WNT, but this pool of GSK3 is not affected by pSer9 phosphorylation.

      Gavagan et al. address this question using an in vitro approach with purified proteins. They identify a role for AXIN1 in protecting the "WNT signaling pool" of GSK3 from the auto-inhibition that occurs upon pSer9 phosphorylation.

      Specifically, they show that i) GSK3-pSer9 is less capable of binding and phosphorylating primed CTNNB1 - thus suggesting that GSK3-pSer9 does not contribute to WNT signaling, ii) in the presence of AXIN1, GSK3-pSer9 becomes more capable of binding and phosphorylating CTNNB1 - suggesting that Axin can promote binding of GSK3 and CTNNB1 even when the primed binding pocket on GSK3 is blocked initially, iii) AXIN1 specifically prevents the PKA mediated phosphorylation of GSK3B on pSer9 - while leaving the phosphorylation of other PKA substrates unaffected.

      Strengths:

      - The authors use an in vitro system in which they can reconstitute different interactions and reactions using purified proteins, thus allowing them to zoom in on specific biochemical events in isolation.

      - The authors measure the phosphorylation of primed substrates (pSer45-CTNNB1 or WNT-independent substrates) and quantify specific kinetic parameters (kcat, KM, and kcat/KM) - of wildtype non-phosphorylated GSK3B, pSer9GSK3B, or the non-phosphorylatable S9A-GSK3B, either in the presence or absence of AXIN1 (or an AXIN1 fragment).

      - The experiments appear to be well-controlled and the results appear to be interpreted correctly.

      Weaknesses:

      - Key experiments (e.g. Figures 2 and 3) are described as being performed as n=3 technical replicates rather than independent/biological replicates.

      - The validation in a biologically relevant setting (i.e. a cellular context) is limited to Figure 4C, which shows that over-expression of AXIN1 reduces the total levels of pSer9-GSK3.

      The authors convincingly show that AXIN1 can play a role in shielding GSK3 from auto-inhibition. As it stands, the impact of this work on the field of WNT/CTNNB1 signaling is likely to remain limited. This is mainly because the mechanism by which AXIN1 shields the WNT/CTNNB1 signaling pool of GSK3 from pSer9 inhibition remains unresolved. Based on the fact that a mini AXIN1 (i.e. an AXIN1 fragment) behaves the same as WT AXIN1, the authors conclude that AXIN1 likely causes allosteric changes on GSK3 but is less likely to block PKA from binding. They cannot conclusively show this, however, as they do not have evidence in favour of one or the other explanation.

      However, this study does offer more insight into the compartmentalisation of GSK3 and the quantitative parameters may be used in computational models describing the different cellular activities of GSK3.

      This work also has conceptual significance: Scaffold proteins are known to promote signal transduction by bringing proteins together (often: kinases and substrates). Here, Gavagan et al. show that AXIN1 also plays a second role, namely in protecting one of its binding kinases (GSK3) from inhibitory signals. This could potentially hold for other scaffolding proteins as well.

    4. Reviewer #2 (Public Review):

      Gavagan et al. investigated the role of the scaffolding protein, Axin, in the cross-pathway inhibition of GSK3b. The authors utilize reconstituted Axin, b-catenin, GSK3b, and protein kinase A to test 2 models. In the first model, the formation of the complex consisting of Axin, b-catenin, and GSK3b overcomes inhibitory phosphorylation of serine 9 of GSK3b. In the second model, the binding of Axin to GSK3b inhibits serine 9 phosphorylation through allosteric effects.

      Previous literature has established that the phosphorylation of serine 9 of GSK3b inhibits its kinase activity. To provide a quantitative measure of inhibition, the authors determine the binding affinity and catalytic efficiency of GSK3b in comparison to GSK3b phosphoS9 towards b-catenin. Interestingly, the data demonstrate a 200-fold decrease in kcat/Km and a 7-fold increase in Km. It is unclear why serine 9 mutation to alanine increases the rate of b-catenin phosphorylation more than the unphosphorylated GSK3b protein in figure S10. Next, the authors tested if the addition of Axin could overcome this inhibition. Although the addition of Axin decreases the Km, thereby producing a 20-fold increase in catalytic efficiency, the addition of Axin does not rescue the catalytic turnover of the phosphorylated GSK3b. Hence, the authors propose that Axin does not rescue the kinase activity of GSK3b from the inhibitory effects of serine 9 phosphorylation.
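
      The relationship between the reported fold changes can be worked through with the Michaelis-Menten rate law. The sketch below assumes simple Michaelis-Menten behavior and that both fold changes refer to the same enzyme-substrate pair; the function names and the unit parameter values are hypothetical. Since kcat equals (kcat/Km) multiplied by Km, a 200-fold decrease in kcat/Km combined with a 7-fold increase in Km implies a change in kcat of 7/200.

```python
def mm_rate(substrate, kcat, km, enzyme=1.0):
    """Michaelis-Menten rate: v = [E] * kcat * [S] / (Km + [S])."""
    return enzyme * kcat * substrate / (km + substrate)


def implied_kcat_fold_change(efficiency_fold, km_fold):
    """Fold change in kcat implied by fold changes in kcat/Km and Km.

    Follows from kcat = (kcat/Km) * Km.
    """
    return efficiency_fold * km_fold


# Fold changes quoted in the review: kcat/Km down 200-fold, Km up 7-fold
kcat_fold = implied_kcat_fold_change(efficiency_fold=1 / 200, km_fold=7)
print(f"implied kcat is {kcat_fold:.3f} of the unphosphorylated value "
      f"(about {1 / kcat_fold:.0f}-fold lower)")

# Hypothetical reference parameters in arbitrary units
kcat_ref, km_ref = 1.0, 1.0
kcat_p, km_p = kcat_ref * kcat_fold, km_ref * 7

# Well below Km, the rate ratio approaches the kcat/Km ratio
low_s = 1e-6
ratio_low = mm_rate(low_s, kcat_p, km_p) / mm_rate(low_s, kcat_ref, km_ref)
```

      Under these assumptions, most of the loss in catalytic efficiency comes from reduced turnover rather than weaker substrate binding, which is consistent with the reviewer's observation that lowering Km alone does not restore activity.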

      Next, the authors test if Axin protects GSK3b from phosphorylation by the upstream kinase PKA. Excitingly, the data show a decrease in binding affinity and catalytic efficiency of PKA with GSK3b phosphoS9 in comparison to GSK3b. The binding of Axin inhibits GSK3b serine 9 phosphorylation by PKA but does not inhibit the phosphorylation of other PKA substrates such as Creb. The authors demonstrate that a fragment of Axin, residues 384-518, behaves similarly to the full-length Axin to shield GSK3b from phosphorylation. However, it is unclear how this fragment may bind in the destruction complex and if Axin has allosteric effects on GSK3b.

    1. Author Response

      Reviewer #1 (Public Review):

      Various parts of the premotor cortex have been implicated in choices underlying decision-making tasks. Further, norepinephrine has been implicated in modulating behavior during various decision-making tasks. Less work has been done on how noradrenergic modulation would affect M2 activity to alter decision-making, nor is it clear whether noradrenergic modulation effects on activity would differ between the male and female sexes.

      This manuscript addresses some of these questions.

      • In particular, clear sex differences in task engagement are seen.

      • May also show some interesting differences and distributions of β2 adrenergic receptors in M2 between males and females.

      We thank the reviewer for their summary of our findings and thoughtful critique of our manuscript. In our revised manuscript we have taken measures to address the reviewer’s comments in line (blue edits in text and revised figures) with direct responses outlined below. We believe these revisions improve the scientific rigor of our findings and provide relevant context for our studies. We hope that they have sufficiently addressed the reviewer’s concerns.

      Less clear is the specificity of systemic antagonism of β adrenergic receptors on the changes in M2 activity reported. As propranolol was given systemically, changes in M2 firing rates could also be due to broader circuit (indirect) activity changes. As it was not given locally, nor were local receptor populations manipulated, one is unable to make the conclusion that changes in neural activity are due to the direct effects of adrenergic receptors within M2 populations.

      We agree that propranolol-driven changes in anterior M2 activity may arise via multiple mechanisms, including direct action on the adrenoreceptors within M2, and indirect action via other regions that project to M2. Although locally activating inhibitory interneurons within M2 is sufficient to disrupt cue-guided action plans and behavior in a 2AFC task (Inagaki et al., 2018), our noradrenergic manipulation was not restricted to M2. We have clarified our conclusions and provided additional discussion to highlight that propranolol actions were multifaceted and that direct actions in M2 are likely working in concert with propranolol-mediated actions in other regions.

      Also not clear, is the contribution of M2 to this task, and whether the changes in M2 activity patterns observed are directly responsible for the behavioral disruptions measured.

      We have revised our introduction and discussion to more clearly outline the critical role of cue-guided action plans in M2 for successful behavior in 2AFC tasks. Suppression of cue-guided activity in M2 results in behavioral performance at near chance levels, similar to what we saw in females after propranolol (Guo et al., 2017; Inagaki et al., 2018; Li et al., 2016). Furthermore, targeted photostimulation of action plan-encoding neurons in M2 is sufficient to drive behavioral responses (Daie et al., 2021). In our investigations, it is plausible to expect propranolol-related disruptions in other cognitive, sensory or motor regions. Based on the strong foundational evidence for M2 activity in 2AFC, the propranolol-driven changes in anterior M2 in females, whether directly or indirectly mediated, are likely sufficient to drive behavioral disruptions in accuracy and/or trial completion.

      Reviewer #2 (Public Review):

      This paper by Rodbarg et al describes an interesting study on the role of beta noradrenergic receptors in action-related activity in the premotor cortex of behaving rats. This work is precious because even if the action of neuromodulatory systems in the cortex is thought to be critical for cognition, there is very little data to actually substantiate the theories. The study is well conducted and the paper is well written. I think, however, that the paper could benefit from several modifications since I can see 3 major issues:

      We thank the reviewer for their generous comments on the potential impact of our manuscript as well as their suggestions to improve this work. Below we outline responses to specific comments raised by the reviewer in addition to addressing them in the revised manuscript. We hope these responses sufficiently address the reviewer’s concerns.

      Both from a theoretical and from a practical point of view, the emphasis on 'cue-related' activity and the potential influence of NA on sensory processing is problematic. First, recent studies in rodents and primates have clearly demonstrated that LC activation is more closely related to actions than to stimulus processing (see Poe et al, 2020 for review).

      Indeed, during optimal performance the peaks of LC activity are larger when PETHs are aligned to action initiation rather than the cue itself (Clayton et al., 2004). This alignment resolves variability in decision processing times and omitted cues. Although LC responses align with action, they are evoked by, and occur after, cue presentation, with LC responses to visual cues occurring ~60 ms after presentation (Aston-Jones & Bloom, 1981). The same behavioral action without preceding task-relevant cues does not evoke an LC response (Rajkowski et al., 2004).

      In our current study, cues initiate activity in anterior M2, which is our primary interest and where our electrodes are placed. The window between cue delivery and action completion targets our goal of investigating the role of β noradrenergic signaling in target cortical processing, rather than LC explicitly. In both NHPs and rodents, NE signaling (and evoked LC activity) promotes sustained cortical representations between cue onset and actions across cortical regions (dlPFC, S1) (Ramos & Arnsten, 2007; Vazey et al., 2018; Wang et al., 2007). In the current study we aligned neural data to either cue presentation (Figure 3) or action (lever press; Figure 4). Both presentations support a critical role for β adrenoreceptor signaling in suppressing irrelevant information and in resolving and maintaining action plans. A unique feature of aligning the data to cue onset is that it allows us to see how the neural activity changes not only on completed trials (that end with a lever press) but also on omitted trials (which strongly increase after propranolol). We propose that the large increases in omitted trials arise because β adrenoreceptor blockade either directly or indirectly prevents anterior M2 from resolving an action plan.

      Second, the analysis of neural activity around cue onset should be examined with spikes aligned on the action, since M2 is a motor region and raster plots suggest that activity is strongly related to action (I'll be more specific below).

      We agree that M2 shows important action plan activity, which we highlight throughout the manuscript. In cued tasks, M2 neurons have been shown to represent action plans starting at cue onset and continuing up to behavioral execution. Neural data were examined and results presented aligned to cue onset (illustrated in Figure 3) and aligned to action, i.e. lever press (illustrated in Figure 4). The impact of propranolol in diminishing action plan selection was similar in both action- and cue-aligned analyses.

      The distinction between neural activity and behavior or cognition is not always clear. I understand that spike count can be related to motor preparation or decision, but it should not be taken for granted that neuronal activity is action planning. The analysis should be clarified and the relation between neural activity, behavior, and potential hidden cognitive operations should be explicated more clearly.

      We have worked to clarify in our revised introduction, results and discussion the specifics of the known roles of neural activity in M2 in both action planning and decision making. We further expand that the neuronal activity in our study may reflect potential changes in cognitive processing and thus alter resultant behavioral outcomes.

      The sex difference is interesting, but at the moment it seems anecdotal. From a theoretical point of view, is there any ecological/ biological reason for a sex dependency of noradrenergic modulation of the cortex? Is there any background literature on sex differences in motor functions in rats, or in terms of NA action? If not, why does it matter (how does it change the way we should interpret the data?) From a practical point of view, is there a functional sex difference in absence of treatment, or is it that the drug has a distinct effect on males vs females? This has very distinct consequences, I think.

      We did not find overt differences in behavior in the absence of treatment. Only when noradrenergic function was challenged using propranolol did we identify functional sex differences. We agree that this has very distinct consequences – specifically it supports sex differences that can be revealed by perturbations of normal function. These functional sex differences may be a result of differences in the anatomy of central noradrenergic systems, a hypothesis further supported by our mRNA expression findings and existing literature on LC anatomy across species (Bangasser et al., 2011, 2016; Luque et al., 1992; Mulvey et al., 2018; Ohm et al., 1997; Pinos et al., 2001). Collectively these results have potential ramifications for understanding sex differences in disease prevalence and targeted treatments.

      Background literature supports some innate sex differences in motor function and executive function in rodents and humans. Of particular relevance to our investigation is an established difference in behavioral strategy, with females being more risk-averse than males (Grissom & Reyes, 2019). Ethologically, risk-averse strategies may support parental care roles, and increased inhibitory mechanisms may be selected for in females. Although this strategy was not directly tested in our study, the large increase in omissions after propranolol seen in females is in line with avoiding risk (incorrect choices) during uncertainty (disrupted neural signaling). As with other executive functions, the utilization of norepinephrine within the cortex along with other neuromodulators, and local microcircuit interactions would all contribute to promoting risk-averse behavior.

      These issues could be clarified both in the introduction and in the discussion, but the authors might have a different view on what is theoretically relevant here. In the result section, however, I think that both the lack of specificity in the description of behavior and cognitive operation and the confusion between 'sensory' and 'motor' functions make it very difficult to figure out what is going on in these experiments, both at a behavioral and at a neurophysiological level. First, the description of the behavior in the task is clearly not sufficient, which makes the interpretation of the measures very difficult.

      We have made an effort to better specify the task and relevant behavioral operations in both the methods and results and have included a clearer task schematic (Figure 1A). We agree that the confusion between ‘sensory’ and ‘motor’ functions may make it more difficult to understand the findings in this study. Anterior M2 plays a unique role in representing motor/action plans that can be informed by sensory information. This integrative function creates difficulty in parsing the neural activity of anterior M2 as strictly motor, sensory or cognitive. In attempts to improve clarity we have expanded and highlighted relevant information on the known roles of M2 in the introduction and discussion.

      One possible interpretation of the effects of the drug is a decrease in motivation, for instance, due to a decrease in reward sensitivity or an increase in sensitivity to effort. But there are others. More importantly, none of these measures can be used to tease apart action preparation from action execution, even though the study is supposed to be about the former.

      Neural activity during action planning, prior to action execution is known to be an essential function of M2 (Barthas & Kwan, 2017; Gremel & Costa, 2013; Guo et al., 2017; Inagaki et al., 2018, 2022; Li et al., 2016; Siniscalchi et al., 2016; Sul et al., 2011; Wei et al., 2019) for optimal performance in 2AFC tasks. In all, we found that the representation/separation of opposing action plans (a well validated function of M2) prior to responses (lever press) is degraded after propranolol, especially in females. We have provided additional emphasis on these foundational studies throughout our revised manuscript.

      To minimize the impact of motivational factors, effort and reward size remain consistent within our task, and all trials require a random initiation hold prior to cue delivery. As described in our general response to the editor above (Figure 1, above), we investigated whether motivational changes may be reflected in our M2 recordings. PETHs from the first and last 10 trials within saline sessions did not identify potential motivation-related differences in anterior M2 activity. Similarly, across propranolol sessions the neural activity was consistent between early and late trials. We used early and late trials because there was a mild decrease in trial rate during saline sessions in both males and females, potentially indicative of motivation/reward sensitivity changes during these sessions. M2 neural responses consistently separated action plans (after saline) or consistently failed to separate action plans (propranolol sessions).

      Also, but this is less critical: In Figures 2C and D, it looks like there is a bimodal distribution for the effect of propranolol in females. Is there something similar in the neuronal effects of the drug? And in the distribution of receptors? Can it be accounted for by hormonal cycles/ anything else?

      Although there is some clustering in behavioral outcomes all data passed normality assumption as appropriate. Propranolol treatments were not synchronized to hormonal cycles, and the data likely include animals at various hormonal stages. Similar clustering was not apparent in neuronal effects of propranolol, although propranolol increased variability in many measures.

      In a pilot experiment, we did not see any difference in baseline performance on our 2AFC task across the stages of the hormonal cycle of females (diestrus, proestrus, estrus or metestrus) in any measure, including accuracy (F(3,33)=0.59, p=0.63, one-way ANOVA) and omissions (F(3,33)=0.51, p=0.68).

      The description of neural activity is also very superficial. In general, it is not clear how spike count measures have been extracted. For example, legend and figure C are not clear, is the (long) period of cue presentation included in the 'decision time'?? "Cues were presented at a variable interval 200-700ms after initiation and until animals left the well, 'Well Exit'. The time from cue onset to well exit was identified as the decision time (yellow)." Yet on the figure only the period after cue presentation is in yellow. This is critical because, given the duration of the cue, the animals are probably capable of deciding (to exit the well) before the cue turns off. Indeed, as shown in fig 2D, the animals can decide within about 500 ms. So to what extent is the 'cue response' actually a 'decision response'?

      We have clarified the task and spike count measurements in methods and added a revised task schematic. It is correct that the cues are available throughout the decision time (for up to 5 seconds or until well exit), and an action plan is generated before well exit/cues turn off as reflected by the separation of neural action plans (Fig 3, saline). Anterior M2 neurons maintain action plan representation from cue onset until the lever press under normal conditions (Fig 4, saline). These action plans encapsulate “cue responses” and “decision responses”. We have aligned neural data to discrete timestamps at either end of the window in which M2 processing is known to be critical, specifically between cues and actions (lever press) and focus on neural activity relative to those points. We refer to this activity throughout the manuscript as an ‘action plan’ as action planning functions of M2 activity have been well established in prior studies.

      When looking at figure 3A, there is clearly a pattern on the raster, a line going from top left to bottom right. If the trials are sorted chronologically, something is happening over time. If, as I suspect, trials are sorted by ascending response time, this raster is showing that what authors are calling a 'response to cues' is actually a response around action. Basically, if propranolol slows down reaction time, the spikes will be delayed from cue onset only because they remain locked to the action. Then the whole analysis and interpretation need to be reconsidered. But it might be for the best: as I mentioned earlier, recent work on LC activity has clearly emphasized its influence on motor rather than sensory processing (Poe et al, 2020).

      Figure 3A is a single neuron example, and data analyses focus on population-wide activity. Neural data is presented both aligned to cues, for all trials in which a cue was received, and aligned to lever press (action), for all trials on which a lever press occurred. In both cases, aligned to cue or aligned to action, the impact of propranolol is the same: β adrenoreceptor blockade reduces the separation of action plans in M2, severely so in females. However, a major finding is that females receive a cue but omit a large number of trials after propranolol; for this outcome the action does not occur. We propose this is due to the lack of action plan separation in anterior M2 (either directly or indirectly). When no behavioral response occurs, these trials cannot be aligned to action, yet we are still interested in the neural activity during the critical window between cue delivery and actions. We are not assigning this neural activity to sensory processing but using this discrete sensory event within our trials (cue) to align the data, as there is substantial evidence that action plans in M2 arise after cue presentation in tasks such as ours where performance is guided by external cues.

      Fig 2D-F: it is hard to believe that the increase in firing rate induced by propranolol in females is not significant. Presumably, because the range of the median firing rate is so high in the first place, distribution (2E) really indicates an increase in firing. Maybe some other test? E.g., a paired t-test, or standardized values (z-scores) to get rid of variability in firing across neurons?

      We agree that the session-wide firing rate appears rightward shifted in females after propranolol. As our recordings were taken on different days, several days apart, we cannot assume they are the same neurons for paired analyses. In our revised manuscript we evaluated these distributions using a Mann-Whitney test to increase power and decrease the impact of variability within the population. Previously we had used a Kolmogorov-Smirnov test. Using our new analysis, we can confirm that propranolol significantly increases session-wide firing rates in anterior M2 of females (p=0.027) but not males. This finding increases evidence for direct actions of propranolol within M2 and supports our hypothesis that propranolol leads to local disinhibition by reducing β noradrenergic signaling in interneurons and that without this noradrenergic tone anterior M2 is less efficient at suppressing irrelevant action plans.
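To make the unpaired comparison concrete, the following is a minimal sketch of a two-sided Mann-Whitney U test on per-neuron firing rates from two independent sessions. It is an illustration only, not the authors' analysis code: the firing-rate values are invented, the function names are hypothetical, and the p-value uses the plain normal approximation without tie or continuity correction, which may differ from the software used in the manuscript.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no corrections).

    Returns (U for x, U for y, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, p


# Hypothetical session-wide firing rates (Hz); different neurons in each list.
saline = [2.1, 3.4, 1.8, 4.0, 2.7, 3.1]
propranolol = [3.9, 4.6, 2.9, 5.8, 4.1, 5.2]
u_saline, u_prop, p_value = mann_whitney_u(saline, propranolol)
print(f"U = {min(u_saline, u_prop):.1f}, two-sided p = {p_value:.3f}")
```

Because the test works on ranks, a few neurons with very high firing rates do not dominate the result, which is the property the response relies on when the populations are recorded on different days and cannot be paired.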

      Along those lines, would it be worth looking for effects on specific populations (interneurons) which are sometimes characterized by thinner spikes and higher mean firing rates? Given the distribution of beta receptors RNA on interneurons, one would actually expect an effect of propranolol on the firing rate irrespective of task events. Or what is it that prevents the influence of propranolol on interneurons from changing the firing rate? In any case, one of the strengths of this study is the localization of beta receptors on specific neuronal populations in the cortex, so I think that the authors should really try to build on it and find something related to the neurophysiological effects. Otherwise, one cannot exclude the possibility that the behavioral effects are not related to the influence of the drug on these receptors in that region.

      Data were collected using stainless steel electrode arrays and our sample population of task related neurons is likely biased to pyramidal neurons, with a small number of fast spiking interneurons. We used validated spike waveform parameters of interneurons in premotor cortex (peak-to-trough ratio and duration; Giordano et al., 2023) in an attempt to isolate putative interneurons and found only a very small number of these cells in our recordings (n=5-7 per group). This population is too small to make any inferences about specific impacts. We have focused on the collective population activity of M2 as this is most strongly related to optimal action planning.

      You are correct that from the given findings we cannot conclusively show that the results found here are a result of propranolol acting solely within anterior M2. We have made sure to clarify throughout our revised manuscript that the behavioral and physiological changes we identified are a result of collective direct and indirect actions of propranolol.

      The conclusion that neuronal discrimination decreases because the proportion of neurons showing no effect increases is confusing (negative results, basically). It would be clearer if they were reporting the number of neurons that do show an effect, and presumably that this number shows a significant decrease.

      The reviewer is correct that the number of neurons that do show an effect (task-related activity) does significantly decrease with propranolol (from n=70 to 27 in females and n=71 to 48 in males). These n are now given adjacent to the proportions rather than at the end of the paragraph. Proportions were used for statistical analysis due to an overall decrease in the total number of units after propranolol. All PETHs presented are from neurons that show some task-related activity; these PETHs confirm that neural activity no longer effectively discriminates/separates action plans in M2.

      Figs 3F-I: a good proportion of neurons (at least 20%) show a significant encoding before cue onset. How is it possible? This raises the issue of noise level/ null hypothesis for this kind of repeated analysis. How did the author correct for multiple comparison issues?

      In response to reviews, we have altered the manner in which we identify the significantly modulated neurons to increase rigor and no longer include these figures or analyses. The proportion of neurons showing action plan encoding prior to cue onset was likely an artifact of how the data was analyzed and an insufficient correction for multiple comparisons, allowing inclusion of internally generated action plans in some neurons.

      The description of the action-related activity is globally confusing. Again, how can the authors discriminate between activity related to planning vs action itself? What is significant and what is not, in males vs females? What is being measured here? For example, a very unclear statement on line 238: "Propranolol primarily disrupted active inhibition of irrelevant action selection in M2 activity, reducing the ability to maintain action plan representation in M2, delaying lever press responses (Figure 4L, 4M)." What is 'active inhibition'? What is an irrelevant action plan? What is selection? All of that should be defined using objective behavioral criteria and tested formally.

      We have changed our wording to clarify what we are describing and why we have chosen the words we have, and to ensure consistency and objectivity throughout the manuscript. Much of the wording we have used – for example action planning or action plan selection, are the words used in the literature to describe M2 neural activity. We call the activity in M2 action planning (either externally/cue guided or internally guided) because that is what has been previously demonstrated. In our task design and analysis we are tracking cue guided actions, as opposed to internally guided.

      We also separate the electrophysiology data as preferred and nonpreferred because the literature has shown that individual M2 neurons show specific directional tuning, as noted in our results. Using the term ‘preferred’ encapsulates that tuning regardless of left/right direction. An example M2 neuron that increases activity for left cues and responses (preferred direction) will show active inhibition (low/negative z-scores) on trials with right cues and responses (nonpreferred); other neurons would show the inverse relationship with direction.

      A primary impact of propranolol was the loss of negative z-scores for nonpreferred trials, i.e., neurons with a left preference that are usually inhibited on right trials were still firing, and vice versa. After propranolol, neurons continue to fire for an irrelevant action plan (for the opposite direction), and the resulting population activity is not significantly different for opposing cues/responses. Behavioral responses normally occur after opposing action plans have significantly separated in M2; collapsing action plans, either by preventing relevant signaling (Guo et al., 2017; Inagaki et al., 2018; Li et al., 2016) or by facilitating irrelevant signaling as we see here with propranolol, leads to impairments in 2AFC performance.
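The preferred/nonpreferred convention described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' pipeline: the firing rates are invented, the function names are made up, and it assumes that z-scores are computed against a pre-cue baseline and that the preferred direction is simply the one with the larger mean z-score.

```python
import statistics


def zscore_to_baseline(rates, baseline):
    """Express firing rates as z-scores relative to baseline mean and SD."""
    mu = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return [(r - mu) / sd for r in rates]


def preferred_and_nonpreferred(left, right, baseline):
    """Label the direction with the larger mean z-score as 'preferred'.

    A negative z_nonpref means firing drops below baseline on trials in the
    opposite direction, which is what the text calls active inhibition.
    """
    z_left = statistics.mean(zscore_to_baseline(left, baseline))
    z_right = statistics.mean(zscore_to_baseline(right, baseline))
    if z_left >= z_right:
        return {"preferred": "left", "z_pref": z_left, "z_nonpref": z_right}
    return {"preferred": "right", "z_pref": z_right, "z_nonpref": z_left}


# Hypothetical single-neuron firing rates (Hz) in the cue-to-press window.
baseline_rates = [4.0, 5.0, 6.0, 5.5, 4.5]
left_trials = [9.0, 8.5, 10.0]
right_trials = [2.5, 3.0, 2.0]
result = preferred_and_nonpreferred(left_trials, right_trials, baseline_rates)
print(result["preferred"], round(result["z_pref"], 2), round(result["z_nonpref"], 2))
```

Under this convention, the effect described in the response would appear as z_nonpref moving from clearly negative toward zero or positive values, so that preferred and nonpreferred traces no longer separate.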

      Also, the description of the classifier analysis should be more thorough. Referencing the toolbox is not sufficient to understand what has been done.

      We have added additional explanation in both the methods and description of the results to clarify the functions of the neural decoding toolbox and how we are using it to evaluate information encoding within M2. We have provided detail on how the algorithm was trained, how shuffled data was generated, and how we determined significance of decoding accuracy.
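For readers unfamiliar with this style of analysis, the general logic (train a classifier on population activity, then compare its accuracy with a label-shuffled null distribution) can be sketched as below. This is a generic illustration and not the toolbox used in the manuscript: the nearest-centroid classifier, leave-one-out scheme, number of shuffles, and all data values are assumptions chosen for brevity.

```python
import random


def centroid(rows):
    """Mean population vector across a list of trials."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two population vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def loo_accuracy(trials, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (x, y) in enumerate(zip(trials, labels)):
        groups = {}
        for j, (xj, yj) in enumerate(zip(trials, labels)):
            if j != i:  # the held-out trial never contributes to a centroid
                groups.setdefault(yj, []).append(xj)
        guess = min(groups, key=lambda lab: sq_dist(x, centroid(groups[lab])))
        correct += guess == y
    return correct / len(trials)


def permutation_p(trials, labels, n_shuffles=200, seed=0):
    """Observed accuracy and a p-value from a label-shuffled null."""
    rng = random.Random(seed)
    observed = loo_accuracy(trials, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if loo_accuracy(trials, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Hypothetical spike counts from two neurons on left and right trials.
demo_trials = [[5, 1], [6, 1], [5, 2], [6, 2], [1, 5], [1, 6], [2, 5], [2, 6]]
demo_labels = ["left"] * 4 + ["right"] * 4
accuracy, p_value = permutation_p(demo_trials, demo_labels)
print(f"decoding accuracy = {accuracy:.2f}, permutation p = {p_value:.3f}")
```

Shuffling the labels breaks any relationship between activity and direction while preserving the firing statistics, so accuracy on shuffled data estimates what the classifier achieves by chance.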

      Measuring Beta adrenoceptors is a great idea, and the results are interesting, especially the difference between neuron types. But again, how does that fit with neurophysiological results? Note, that since this is RNA measures, it should not be phrased as 'receptors' but 'receptors RNA' throughout. One possible interpretation of these anatomical results that cannot be reconciled with physiology is that protein expression at the membrane shows a distinct pattern.

      We have changed the references to β receptor expression to β receptor mRNA expression throughout the manuscript. Although mRNA provides a valuable proxy for adrenoreceptor production, as noted by the reviewer, protein expression at the membrane may differ. Reliable antibodies that allow quantitative analysis of membrane-bound adrenoreceptors in situ with co-labeling of specific cell types are limited. The goal of assessing mRNA expression within M2 was to determine if the functional sex differences we identified in M2 neurophysiology when manipulating β adrenoreceptor function could be mediated by basal differences in adrenoreceptors. The causal impact of differential mRNA expression in anterior M2 was not directly tested, but our findings provide preliminary evidence that adrenoreceptor regulation may differ across sexes. Our results provide a plausible avenue for differential sensitivity to β adrenoreceptor manipulation across sexes that may also be found in other brain regions.

      In conclusion, I think that this is a very interesting study and that the results are potentially relevant for a wide audience. But the paper would clearly benefit from revisions. If the authors could clearly identify a significant relationship between the action of NA on beta receptors on specific cortical neurons, at a physiological and behavioral level, that would be a seminal study. At the moment, the evidence is not convincing enough but the data suggest that it is the case.

      We thank the reviewer for the kind remarks. We have undertaken a number of new analyses, refined existing analysis and clarified our claims in the manuscript to improve rigor. Collectively our data reflect that the behavioral and neural deficits after systemic propranolol are likely due to both direct and indirect actions on M2. We believe this work is compelling and that it will inform future work investigating potential sex differences in central noradrenergic anatomy and functional sex differences after perturbations of noradrenergic signaling.

    2. eLife assessment

      Rodberg et al. show that systemic β adrenergic antagonism reduces engagement in decision-making, particularly in female rats, and reduces task-related encoding in neural activity. This is a valuable finding that addresses a gap in the field; however, the understanding of the direct contribution of β adrenergic receptors to the observed effects is incomplete. Further, the theoretical grounds, data analyses, and results could be improved in several ways.

    3. Reviewer #1 (Public Review):

      Various parts of the premotor cortex have been implicated in choices underlying decision-making tasks. Further, norepinephrine has been implicated in modulating behavior during various decision-making tasks. Less work has been done on how noradrenergic modulation would affect M2 activity to alter decision-making, nor is it clear whether noradrenergic modulation effects on activity would differ between the male and female sexes.

      This manuscript addresses some of these questions.

      - In particular, clear sex differences in task engagement are seen.
      - May also show some interesting differences and distributions of β2 adrenergic receptors in M2 between males and females.

      Less clear is the specificity of systemic antagonism of β adrenergic receptors on the changes in M2 activity reported. As propranolol was given systemically, changes in M2 firing rates could also be due to broader circuit (indirect) activity changes. As it was not given locally, nor were local receptor populations manipulated, one is unable to make the conclusion that changes in neural activity are due to the direct effects of adrenergic receptors within M2 populations.

      Also not clear, is the contribution of M2 to this task, and whether the changes in M2 activity patterns observed are directly responsible for the behavioral disruptions measured.

    4. Reviewer #2 (Public Review):

      This paper by Rodberg et al. describes an interesting study on the role of beta noradrenergic receptors in action-related activity in the premotor cortex of behaving rats. This work is precious because even if the action of neuromodulatory systems in the cortex is thought to be critical for cognition, there is very little data to actually substantiate the theories. The study is well conducted and the paper is well written. I think, however, that the paper could benefit from several modifications since I can see 3 major issues:

      Both from a theoretical and from a practical point of view, the emphasis on 'cue-related' activity and the potential influence of NA on sensory processing is problematic. First, recent studies in rodents and primates have clearly demonstrated that LC activation is more closely related to actions than to stimulus processing (see Poe et al, 2020 for review). Second, the analysis of neural activity around cue onset should be examined with spikes aligned on the action, since M2 is a motor region and raster plots suggest that activity is strongly related to action (I'll be more specific below).

      The distinction between neural activity and behavior or cognition is not always clear. I understand that spike count can be related to motor preparation or decision, but it should not be taken for granted that neuronal activity is action planning. The analysis should be clarified and the relation between neural activity, behavior, and potential hidden cognitive operations should be explicated more clearly.

      The sex difference is interesting, but at the moment it seems anecdotal. From a theoretical point of view, is there any ecological/ biological reason for a sex dependency of noradrenergic modulation of the cortex? Is there any background literature on sex differences in motor functions in rats, or in terms of NA action? If not, why does it matter (how does it change the way we should interpret the data?) From a practical point of view, is there a functional sex difference in absence of treatment, or is it that the drug has a distinct effect on males vs females? This has very distinct consequences, I think.

      These issues could be clarified both in the introduction and in the discussion, but the authors might have a different view on what is theoretically relevant here. In the result section, however, I think that both the lack of specificity in the description of behavior and cognitive operation and the confusion between 'sensory' and 'motor' functions make it very difficult to figure out what is going on in these experiments, both at a behavioral and at a neurophysiological level.

      First, the description of the behavior in the task is clearly not sufficient, which makes the interpretation of the measures very difficult. One possible interpretation of the effects of the drug is a decrease in motivation, for instance, due to a decrease in reward sensitivity or an increase in sensitivity to effort. But there are others. More importantly, none of these measures can be used to tease apart action preparation from action execution, even though the study is supposed to be about the former.

      Also, but this is less critical: In Figures 2C and D, it looks like there is a bimodal distribution for the effect of propranolol in females. Is there something similar in the neuronal effects of the drug? And in the distribution of receptors? Can it be accounted for by hormonal cycles/ anything else?

      The description of neural activity is also very superficial. In general, it is not clear how spike count measures have been extracted. For example, legend and figure C are not clear, is the (long) period of cue presentation included in the 'decision time'?? "Cues were presented at a variable interval 200-700ms after initiation and until animals left the well, 'Well Exit'. The time from cue onset to well exit was identified as the decision time (yellow)." Yet on the figure only the period after cue presentation is in yellow. This is critical because, given the duration of the cue, the animals are probably capable of deciding (to exit the well) before the cue turns off. Indeed, as shown in fig 2D, the animals can decide within about 500 ms. So to what extent is the 'cue response' actually a 'decision response'? When looking at figure 3A, there is clearly a pattern on the raster, a line going from top left to bottom right. If the trials are sorted chronologically, something is happening over time. If, as I suspect, trials are sorted by ascending response time, this raster is showing that what authors are calling a 'response to cues' is actually a response around action. Basically, if propranolol slows down reaction time, the spikes will be delayed from cue onset only because they remain locked to the action. Then the whole analysis and interpretation need to be reconsidered. But it might be for the best: as I mentioned earlier, recent work on LC activity has clearly emphasized its influence on motor rather than sensory processing (Poe et al, 2020).

      Fig 2D-F: it is hard to believe that the increase in firing rate induced by propranolol in females is not significant. Presumably, because the range of the median firing rate is so high in the first place, distribution (2E) really indicates an increase in firing. Maybe some other test? E.g., a paired t-test, or standardized values (z-scores) to get rid of variability in firing across neurons?

      Along those lines, would it be worth looking for effects on specific populations (interneurons) which are sometimes characterized by thinner spikes and higher mean firing rates? Given the distribution of beta receptors RNA on interneurons, one would actually expect an effect of propranolol on the firing rate irrespective of task events. Or what is it that prevents the influence of propranolol on interneurons from changing the firing rate? In any case, one of the strengths of this study is the localization of beta receptors on specific neuronal populations in the cortex, so I think that the authors should really try to build on it and find something related to the neurophysiological effects. Otherwise, one cannot exclude the possibility that the behavioral effects are not related to the influence of the drug on these receptors in that region.

      The conclusion that neuronal discrimination decreases because the proportion of neurons showing no effect increases is confusing (negative results, basically). It would be clearer if they were reporting the number of neurons that do show an effect, and presumably that this number shows a significant decrease.

      Figs 3F-I: a good proportion of neurons (at least 20%) show a significant encoding before cue onset. How is it possible? This raises the issue of noise level/ null hypothesis for this kind of repeated analysis. How did the author correct for multiple comparison issues?

      The description of the action-related activity is globally confusing. Again, how can the authors discriminate between activity related to planning vs action itself? What is significant and what is not, in males vs females? What is being measured here? For example, a very unclear statement on line 238: "Propranolol primarily disrupted active inhibition of irrelevant action selection in M2 activity, reducing the ability to maintain action plan representation in M2, delaying lever press responses (Figure 4L, 4M)." What is 'active inhibition'? What is an irrelevant action plan? What is selection? All of that should be defined using objective behavioral criteria and tested formally.

      Also, the description of the classifier analysis should be more thorough. Referencing the toolbox is not sufficient to understand what has been done.

      Measuring Beta adrenoceptors is a great idea, and the results are interesting, especially the difference between neuron types. But again, how does that fit with neurophysiological results? Note, that since this is RNA measures, it should not be phrased as 'receptors' but 'receptors RNA' throughout. One possible interpretation of these anatomical results that cannot be reconciled with physiology is that protein expression at the membrane shows a distinct pattern.

      In conclusion, I think that this is a very interesting study and that the results are potentially relevant for a wide audience. But the paper would clearly benefit from revisions. If the authors could clearly identify a significant relationship between the action of NA on beta receptors on specific cortical neurons, at a physiological and behavioral level, that would be a seminal study. At the moment, the evidence is not convincing enough but the data suggest that it is the case.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) What's the rationale of trypsinizing the tissue prior to mitochondrial isolation? This is not standard for subsequent proteomics analysis. This step will inevitably cause protein loss, especially for the post mitochondrial fractions (PMF). Treating samples with 0.01 µg/µL trypsin at 37 °C for 30 min is sufficient to partially digest a substantial portion of the proteome. If samples from different subjects were not of the same weight, then this partial digestion step may introduce artificial variability, as variable proportions of proteins from different subjects would be lost during this step. In addition, the mitochondrial protein enrichment in the mito fraction, although statistically significant, does not look striking (Figure 1E, ~30% mitochondrial proteins in the mito fraction). As a comparison, Williams et al., MCP 2018 seem to have obtained high mitochondrial protein content in the mito fraction without trypsinizing the frozen quadriceps using a similar SWATH-MS-based approach.

      Trypsinisation of the tissue prior to mitochondrial isolation is based on previous work and a Nature Protocol (1, 2) which isolated mitochondria from skeletal muscle. The rationale is that it aids mechanical homogenisation of highly fibrous tissues such as quadriceps muscle by digesting extracellular matrix proteins. The trypsin/protein ratio used to aid in this process is at least 400 times lower than the amount of trypsin used for formal proteomic tryptic digestion. Three pieces of evidence suggest this step has negligible effect on downstream proteomic analysis. First, because the trypsinisation buffer is detergent free, trypsin will only affect extracellular or exposed membrane proteins. Filtering our PMF dataset for proteins with ‘extracellular matrix’ gene ontology identifies at least 90 unique extracellular matrix proteins, indicating good retention of proteins susceptible to partial digestion. Second, the trypsin dose used is 50 times lower than the concentration used for passaging cultured cells, which retain viability after trypsinisation. Third, and contrary to the point raised by the reviewer, we observe less missingness in PMF samples compared to mitochondrial samples. We thank the reviewer for bringing the Williams et al. 2018 MCP paper to our attention. We note that mitochondrial enrichment between the two papers is comparable (~2-fold). To improve clarity line 408 now reads: “Whole quadriceps muscle samples were prepared as previously described with modification (99, 100). First, tissue was snap frozen with liquid nitrogen…” and line 95 reads: “Mitochondrial proteins were defined based on their presence in MitoCarta 3.0 (24) and consistent with previous work (25) were approximately two-fold enriched in the mitochondrial fraction relative to the PMF (Fig 1E).”

      (2) The authors mentioned that the proteomics data were Log2 transformed and median-normalized. Would it be possible to provide a bit more details on this? Were the subjects randomized?

      Samples were randomised prior to sample processing and mass spectrometry analysis. Because of possible variation in total protein content, it is critical to normalise protein intensities between samples. Median normalisation adjusts the samples so that they have the same median, thereby accounting for technical variation. Log2 transformation helps to achieve normal distributions, which is critical for many downstream statistical tests. Line 471 now reads: “…to achieve normal distributions and account for technical variation in total protein.”
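As a minimal sketch of what Log2 transformation followed by median normalisation typically involves, the example below shifts each sample on the log scale so that all samples share a common median. The exact procedure used in the manuscript is not specified here, so the function name, the choice to centre on the median of sample medians, and the intensity values are all assumptions.

```python
import math
import statistics


def log2_median_normalise(intensities):
    """Log2-transform raw intensities, then align sample medians.

    intensities maps sample name -> {protein: raw intensity}. Proteins that
    are missing or non-positive in a sample are skipped. Each sample is
    shifted on the log2 scale so its median equals the median of all sample
    medians (an assumed, commonly used target).
    """
    logged = {
        sample: {prot: math.log2(v) for prot, v in prots.items() if v > 0}
        for sample, prots in intensities.items()
    }
    sample_medians = {
        sample: statistics.median(vals.values()) for sample, vals in logged.items()
    }
    target = statistics.median(sample_medians.values())
    return {
        sample: {prot: v - sample_medians[sample] + target for prot, v in vals.items()}
        for sample, vals in logged.items()
    }


# Hypothetical raw intensities; mouse_B was loaded with roughly 4x more protein.
raw = {
    "mouse_A": {"P1": 2.0, "P2": 4.0, "P3": 8.0},
    "mouse_B": {"P1": 8.0, "P2": 16.0, "P3": 32.0},
}
normalised = log2_median_normalise(raw)
for sample, prots in normalised.items():
    print(sample, {prot: round(v, 2) for prot, v in prots.items()})
```

In this toy case the two mice differ only by a constant loading factor, so after normalisation their profiles coincide; differences that remain after this step are the ones attributed to biology rather than to total protein input.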

      (3) In Figure 1D, what were the numbers of mice the authors used for the CV comparisons in each group? Were they of similar age and sex? Were the differences in CV values statistically significant?

      The mitochondrial and PMF proteomes originated from the same quadriceps sample from the same mouse, and thus the age and sex are the same across both proteomes. After quality control, we had mitochondrial proteomes for 194 mice and PMF proteomes for 215 mice. The overall CV in the mitochondrial fraction was significantly greater than in the PMF, however whether the source of this variation is biological, or the result of mitochondrial isolation is unclear and as such we have avoided making a statement within the body of the manuscript. We have now more clearly described the nature of the samples in the revised manuscript and added sample sizes to figure 1F.

      (4) The authors stated in lines 155-157 that proteins negatively associated with the Matsuda index were further filtered by presence of their cis-pQTLs. Perhaps more explanations would be needed to justify this filtering criterion? Having a cis-pQTL would mean the protein abundance variation is explained by the variation in its coding gene, this however conceptually would not be relevant to its association with the Matsuda index. With the data that the authors have in hand, would it not be natural to align the Matsuda index QTL with the pQTLs (cis and trans if available), and/or to perform mediation analysis to examine causal relationships with statistical significance?

      The rationale for filtering by cis-pQTL was not to study the genetics of either Matsuda or associated proteins but rather to identify proteins that were more likely to be causally associated with Matsuda Index as opposed to adaptively associated. To clarify this line 165 now reads: “Filtering based on cis-pQTL presence was based on the rationale that if genetic variation can explain protein abundance differences between mice, then we can be confident that phenotype (Matsuda Index) is not driving the observed differences and therefore the protein-phenotype associations are likely causal. Importantly, this assumption can only be made for cis-acting pQTLs.” Previous work by Matthew et al. (see https://qtlviewer.jax.org/) has demonstrated that cis-pQTL have markedly higher LOD scores than trans-pQTLs, and our own unpublished work suggests that trans-pQTLs do not reproduce well between datasets. The reviewer rightfully suggests aligning protein QTL with those for Matsuda. This is our long-term goal but to identify genome wide significant peaks associated with altered Matsuda will require many more mice than studied here.

      (5) It seems a bit odd that the first half of the paper focused extensively on the authors' discoveries in the mitochondrial proteome, and how proteins involved in mitochondrial processes (such as complex I) were associated with Matsuda Index, but the final fingerprint list of insulin resistance, which contained 76 proteins, only had 7 mitochondrial proteins. Was this because many mitochondrial proteins were filtered out due to no cis-pQTL presenting?

      There are three reasons our fingerprint is lacking mitochondrial proteins: 1) there are more non-mitochondrial than mitochondrial proteins in the muscle proteome; 2) we focussed on negatively associated proteins, and as demonstrated in figure 2c, the mitochondrial proteome is enriched for positively associated proteins; 3) as implied by the reviewer, we filtered for pQTL presence, further reducing the number of mitochondrial proteins in our fingerprint. To improve clarity, line 170 now reads: “Low mitochondrial representation in the fingerprint is the result of selecting negatively associating proteins, and as seen (Figure 2C) previously, the mitochondrial proteome is enriched for positive contributors to insulin resistance.”

      (6) The authors found that thiostrepton-induced insulin resistance reversal effects were not through insulin signalling. It activated glycolysis but the mechanism of action was not clear. What are the proteins in the fingerprint list that led to identification of thiostrepton on CMAP?

      Is thiostrepton able to bind or change the expression of these proteins? Since thiostrepton was identified by searching the insulin resistance fingerprint protein list against CMAP, it would be rational to think that it exerts the biological effects by directly or indirectly acting on these protein targets.

      This is indeed the implication of our data. Because of the timescales involved it is unlikely that thiostrepton is changing fingerprint protein levels but could be binding to and inhibiting them. Searching the CMAP thiostrepton signature reveals ARHGDIB and NAGK as the fingerprint proteins with the most positive and negative fold-changes respectively perhaps suggesting they play a role in thiostrepton’s mechanism of action. Experiments are underway to test this hypothesis however these are beyond the scope of the current paper.

      Reviewer #2 (Public Review):

      Line 105: The observation that variance in respiratory proteins is stable while lipid pathways is variable is quite interesting. Is this due to lower overall levels of lipid metabolism enzymes (ex. do these differ substantially from similar pathways ranked from high-low abundance?).

      The relationship between coefficient of variation (CV) and relative abundance of proteins is important to consider. To address this, we have now also performed GSEA on proteins ranked from high to low relative abundance. These comparisons have been added to supplementary figure 1 and line 110 now reads: “As a control experiment, we also performed enrichment analysis on proteins ranked by LFQ relative abundance. High CV pathways (enriched for high CV proteins) tended to be lower in relative abundance (enriched for low relative abundance proteins) (Supplementary Fig 1a, b). However, many high variability pathways, lipid metabolism for example, were not enriched in either direction based on relative abundance suggesting differences in relative abundance do not fully explain pathway variability differences.”

      Line 154: the 664 associations are impressive and potentially informative. It would be valuable to know which of these co-map to the same locus, either to distinguish linkage in a 2 Mb window or to identify any cis-proteins which directly exert effects in trans.

      To assess this, we have analysed pQTL position relative to gene position to generate a ‘hotspot’ plot. We have also generated a histogram of this pQTL density (in a 2 Mbp window) and added these figures to figure 3. We did not detect any obvious pQTL hotspots, and the distribution of pQTLs across the genome appears fairly uniform. Line 159 now reads: “These were distributed across the genome and were predominately cis acting (Figure 3A)...”

      Line 194: Cross-platform validation of the CMAP fingerprint results is an admirable set of validations. It might be good to know general parameters like how many compounds were shared/unique for each platform. Also the concordance between ranking scores for significant and shared compounds.

      The Connectivity Map (CMap) query included 5163 compounds, the Prestwick library included 1120, and the overlap was 420. We have added these comparisons to supplementary figure 2. Supplementary figure 2 now also contains a comparison of CMap scores between overlapping compounds (found in CMap and the Prestwick library) against all significant compounds identified by CMap (supplementary figure 2b). Interestingly, compounds present in both platforms scored higher on average, suggesting the Prestwick library captures a significant proportion of highly scoring CMap candidates. Line 206 now reads: “In total, 420 compounds were found across both platforms, and these consensus compounds captured a significant proportion of highly scoring CMap compounds (Supplementary Figure 2A, B).”

      Line 319: Another consideration in the molecular fingerprint is how unique these are for muscle. While studies evaluating gene expression have shown that many cis-eQTLs are shared across tissues, to my knowledge, this hasn't been performed systematically for pQTLs. Therefore, consider adding a point to the discussion pointing out that some of the proteins might be conserved pQTLs whereas others which would be more relevant here present unique druggable targets in muscle.

      To examine tissue specificity, we determined whether our skeletal muscle fingerprint proteins were detected and contained a pQTL in two metabolically important tissues, liver and adipose. Despite detecting almost all the fingerprint proteins in both adipose and liver tissue, they were depleted for pQTL compared to skeletal muscle. These data have now been added to figure 3c. Line 172 now reads: “To assess the tissue specificity of our fingerprint we searched for the same proteins in metabolically important adipose and liver tissues. Despite detecting 94% and 82% of muscle fingerprint proteins across each tissue respectively, both adipose and liver were depleted for pQTL presence (Figure 3C) suggesting that regulation of our fingerprint protein abundance is specific to skeletal muscle.”

      Line 332: These are fascinating observations. 1, that in general insulin signaling and ampk were not themselves shown as top-ranked enrichments with matsuda and that this was sufficient to alter glucose metabolism without changes in these pathways. While further characterization of this signaling mechanism is beyond the scope of this study, it would be good to speculate as to additional signaling pathways that are relevant beyond ROS (ex. CNYP2 and others)

      We have now added further discussion to the manuscript to address this point. Line 347 now reads: “Aside from glycolysis, other pathways may be involved in enhancing insulin sensitivity. For example, the negatively associated protein ARHGDIA (Figure 2F) is a potent negative regulator of insulin sensitivity, and our fingerprint of insulin resistance contained its homologue ARHGDIB. Both ARHGDIA and ARHGDIB have been reported to inhibit the insulin action regulator RAC1 thus lowering GLUT4 translocation and glucose uptake. Further investigations may uncover a role for thiostrepton in modulating the RAC1 signalling pathway via ARHGDIB.”

      Line 314: Remove the statement: "While this approach is less powerful than QTL co-localisation for identifying causal drivers,", as I don't believe that this has been demonstrated. Clearly, the authors provide a sufficient framework to pinpoint causality and produce an actionable set of proteins.

      We have edited line 314, which now reads: “Moreover, our approach has the major advantage that it requires far fewer mice to obtain meaningful outcomes (222 mice in this study) compared to that required for genetic mapping of complex traits like Matsuda Index.”

      Line 346: I would highlight one more appeal of the approach adopted by the authors. Given that these compound libraries were prioritized from patterns of diverse genetics, these observations are inherently more-likely to operate robustly across target backgrounds.

      This point is further supported by our thiostrepton results in both C57BL6/j and BXH9 mice. Line 317 now reads: “Furthermore, because we have used genetically diverse datasets (DOz mice and multiple cell lines in Connectivity Map) our findings are likely robust across diverse target backgrounds.”

      Line 434: I might have missed but can't seem to find where the muscle data are available to researchers. Given the importance and novelty of these studies, it will be important to provide some way to access the proteomic data.

      These data are now available via the ProteomeXchange Consortium. Line 465 now reads: “The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (104) partner repository with the dataset identifier PXD042277.”

      1. Frezza C, Cipolat S, Scorrano L. Organelle isolation: functional mitochondria from mouse liver, muscle and cultured fibroblasts. Nat Protoc. 2007;2(2):287-95.

      2. Acin-Perez R, Benador IY, Petcherski A, Veliova M, Benavides GA, Lagarrigue S, et al. A novel approach to measure mitochondrial respiration in frozen biological samples. The EMBO Journal. 2020;39(13):e104073.

      3. Chick JM, Munger SC, Simecek P, Huttlin EL, Choi K, Gatti DM, et al. Defining the consequences of genetic variation on a proteome-wide scale. Nature. 2016;534(7608):500-5.

      4. Gatti DM, Svenson KL, Shabalin A, Wu L-Y, Valdar W, Simecek P, et al. Quantitative Trait Locus Mapping Methods for Diversity Outbred Mice. G3 Genes|Genomes|Genetics. 2014;4(9):1623-33.

    2. Reviewer #1 (Public Review):

      Masson et al. leveraged the natural genetic diversity presented in a large cohort of the Diversity Outbred in Australia (DOz) mice (n=215) to determine skeletal muscle proteins that were associated with insulin sensitivity. The hits were further filtered by pQTL analysis to construct a proteome fingerprint for insulin resistance. These proteins were then searched against Connectivity Map (CMAP) to identify compounds that could modulate insulin sensitivity. In parallel, many of these compounds were screened experimentally alongside other compounds in the Prestwick library to independently validate some of the compound hits. These two analyses were combined to score for compounds that would potentially reverse insulin resistance. Thiostrepton was identified as the top candidate, and its ability to reverse insulin resistance was validated using assays in L6 myotubes.

      Below are several comments made on the original version of this study, addressed by the authors in the current version:

      (1) Please describe the rationale of trypsinizing the tissue prior to mitochondrial isolation.

      (2) The authors mentioned that the proteomics data were Log2 transformed and median-normalized. Please provide a bit more details on this, including whether the subjects were randomized.

      (3) In Figure 1D, please give the numbers of mice the authors used for the CV comparisons in each group, whether they were of similar age and sex, and whether the differences in CV values were statistically significant.

      (4) The authors stated in lines 155-157 that proteins negatively associated with Matsuda index were further filtered by presence of their cis-pQTLs. Please provide more explanations to justify this filtering criterion.

      (5) Please explain why the first half of the paper focused extensively on the authors' discoveries in the mitochondrial proteome, and how proteins involved in mitochondrial processes (such as complex I) were associated with Matsuda Index, but the final fingerprint list of insulin resistance, which contained 76 proteins, only had 7 mitochondrial proteins.

      (6) The authors found that thiostrepton-induced insulin resistance reversal effects were not through insulin signalling. Please list the proteins in the fingerprint list that led to identification of thiostrepton on CMAP, and discuss whether you think that thiostrepton directly or indirectly acts on these protein targets.

    3. Reviewer #2 (Public Review):

      In the present study, Masson et al. provide an elegant and profound demonstration of utilization of systems genetics data to fuel discovery of actionable therapeutics. The strengths of the study are many: generation of a novel skeletal muscle genetics proteomic dataset which is paired with measures of glucose metabolism in mice, systematic utilization of these data to yield potential therapeutic molecules which target insulin resistance, cross-referencing library screens from connectivity map with an independent validation platform for muscle glucose uptake and preclinical data supporting a new mechanism for thiostrepton in alleviating muscle insulin resistance. Future studies evaluating similar integrations of omics data from genetic diversity with compound screens, as well as detailed characterization of mechanisms such as thiostrepton on muscle fibers will further inform some remaining questions. In general, the thorough nature of this study not only provides strong support for the conclusions made but additionally offers a new framework for analysis of systems-based data. I had made several comments on the prior submission, all of which have been fully addressed and incorporated.

    1. eLife assessment

      This important study illustrates the value of museum samples for understanding past genetic variability in the genomes of populations and species, including those that no longer exist. The authors present genomic sequencing data for the extinct Xerces Blue butterfly and report convincing evidence of declining population sizes and increases in inbreeding beginning 75,000 years ago, which strongly contrasts to the patterns observed in similar data from its closest relative, the extant Silvery Blue butterfly. Such long-term population health indicators may be useful for highlighting still extant but especially vulnerable-to-extinction insect species -- irrespective of their current census population size abundance.

    2. Reviewer #1 (Public Review):

      The authors report a study, where they have sequenced whole genomes of four individuals of an extinct species of butterfly from western North America (Glaucopsyche xerces), along with seven genomes of a closely related species (Glaucopsyche lygdamus), mainly from museum specimens, several to many decades old. They then compare these fragmented genomes to a high-quality, chromosome-level assembly of a genome of a European species in the same genus (Glaucopsyche alexis). They find that the extinct species shows clear signs of declining population sizes since the last glacial period and an increase in inbreeding, perhaps exacerbating the low viability of the populations and contributing to the extinction of the species.

      The study really highlights how museum specimens can be used to understand the genetic variability of populations and species in the past, up to a century or more ago. This is an incredibly valuable tool, and can potentially help us to quickly identify whether current populations of rare and declining species are in danger due to inbreeding, or whether at least their genetic integrity is in good condition and other factors need to be prioritised in their conservation. In the case of extinct species, sequencing museum specimens is really our only window into the dynamics of genomic variability prior to extinction, and such information can help us understand how genetic variation is related to extinction.

      I think the authors have achieved their goal admirably, they have used a careful approach to mapping their genomic reads to a related species with a high-quality genome assembly. They might miss out on some interesting genetic information in the unmapped reads, but by and large, they have captured the essential information on genetic variability within their mapped reads. Their conclusions on the lower genetic variability in the extinct species are sound, and they convincingly show that Glaucopsyche xerces is a separate species to Glaucopsyche lygdamus (this has been debated in the past).

    3. Reviewer #2 (Public Review):

      The Xerces Blue is an iconic species, now extinct, that is a symbol for invertebrate conservation. Using genomic sequencing of century-old specimens of the Xerces Blue and its closest living relatives, the authors hypothesize about possible genetic indicators of the species' demise. Although the limited range and habitat destruction are the most likely culprits, it is possible that some natural reasons have been brewing to bring this species closer to extinction.

      The importance of this study is in its generality and applicability to any other invertebrate species. The authors find that low effective population size, high inbreeding (for tens of thousands of years), and higher fraction of deleterious alleles characterize the Xerces colonies prior to extinction. These signatures can be captured from comparative genomic analysis of any target species to evaluate its population health.

      It should be noted that it remains unclear whether these genomic signatures are indeed predictive of extinction, or whether populations can bounce back given certain conditions and increase their genetic diversity somehow.

      Methods are detailed and explained well, and the study could be replicated. I think this is a solid piece of work. Interested researchers can apply these methods to their chosen species and eventually, we will assemble datasets to study extinction process in many species to learn some general rules.

      Several small questions/suggestions:

      1) The authors reference a study concluding that Shijimiaeoides is Glaucopsyche. Their tree shows the same, confirming previous publications. And yet they still use Shijimiaeoides, which is confusing. Why not use Glaucopsyche for all these blues?

      2) Plebejus argus is a species much more distant from P. melissa than Plebejus anna (anna and melissa are really very close to each other), and yet their tree shows the opposite. What is the problem? Misidentification? Errors in phylogenetic analyses?

      3) Wouldn't it be nicer to show the underside of butterfly pictures that reveals the differences between xerces and others? Now, they all look blue and like one species, no real difference.

      4) The authors stated that one of five xerces specimens failed to sequence, and yet they show 5 specimens in the tree. Was the extra specimen taken from GenBank?

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, the authors set out to investigate spatial RNA processing events, specifically alternative splicing and 3' UTR usage, in mouse brain and kidney tissues using ReadZS and SpliZ methodologies on spatial transcriptomics data. The research contributes to understanding tissue-specific gene expression regulation from a spatial perspective. The study introduces a novel approach for analyzing spatial transcriptomics data, allowing for the identification of RNA processing and regulation patterns directly from 10X Visium data. The authors present convincing evidence supporting the identification of novel RNA processing patterns using their methodology, which holds significant implications for researchers in the field of spatial transcriptomics and the study of alternative splicing and 3' UTR usage.

      Thank you for this thorough overview of our work.

      The conclusions of the study are mostly well-supported by the data; however, certain aspects could be improved to strengthen the findings.

      1) The conclusions of this study would be strengthened by conducting a more extensive tissue sample analysis and including biological replicates. Additionally, appropriate batch effect corrections should be applied when dealing with biological replicates.

      We agree that including biological replicates would strengthen our findings. We will include biological replicates of the mouse brain tissues in the revision.

      2) The 3' UTR usage and alternative splicing should be compared among clearly labeled clusters for a more comprehensive analysis.

      We understand that it can be difficult to see how the SpliZ quantiles map spatially onto the tissue images. For the splicing of Gng13, Myl6, and Rps24, we will include box plots broken down by spatial quadrant in the revision. However, this does result in an oversimplification of the spatial patterns found in the tissue slices, which makes the plots less informative than the quantile plots to our view.

      3) The authors should clarify their rationale for choosing ReadZS and SpliZ approaches and provide comparisons with other methods to demonstrate the advantages and potential limitations of their chosen methodologies.

      Thank you for pointing out the lack of sufficient discussion of ReadZS and SpliZ in the manuscript. The ReadZS and SpliZ were chosen for this analysis because both of these methods provide an individual score for each cell-gene pair, which is easily adapted to providing a score for each spot-gene pair. Due to the sparsity and 3’ bias of Visium data, approaches designed to analyze RNA processing in full-length sequencing analysis are not applicable. The SpliZ and ReadZS are two of the limited number of tools available that are designed for the analysis of RNA processing in droplet-based data. Other available tools tend to rely on aggregating data across multiple cells using a method called pseudo-bulking (Li et al., 2021; Patrick et al., 2020). It is not clear how this could be used for spatial transcriptomics data without potentially obscuring subtle spatial patterns in the data. Others are based on PSI measurements, which are vulnerable to artifacts due to sparsity (Buen Abad Najar et al., 2020; Olivieri et al., 2022; Wen et al., 2022). The tradeoff between pseudo-bulking and a single score per spot-gene pair means that the ReadZS and SpliZ do not have the power to detect changes for genes with very low read counts. We will add text in the revision to clarify this point.

      Reviewer #2 (Public Review):

      The authors applied existing ReadZS and the SpliZ methods, previously developed to analyze RNA processing in scRNA-seq data, to Visium data to study spatial splicing and RNA processing events in tissues by Moran's I. The authors showed several example genes in mouse brain and kidney, whose processing is spatially regulated, such as Rps24, Myl6, Gng13.

      Thank you for this thorough overview of our work.

      The paper touches on an important question in RNA biology about how RNA processing is regulated spatially. Both experimental and computational challenges remain to address it. Despite some potentially interesting findings, most claims remain to be validated by orthogonal methods such as RNA FISH and simulations.

      We appreciate that the reviewer finds the question important, and that the findings are potentially interesting. In the revision we will include biological replicates for our findings in the mouse brain. Unfortunately, experimental validation is outside of our budget for this project. It is unclear what further simulations could validate the biological discoveries in this manuscript: permutations were used to calculate the p value of each discovery, and the false positive and negative rates of the SpliZ have been assessed through simulation (Olivieri et al., 2022).

      In addition, the percentage of spatial processing events (splicing in 0.8-2.2% of detected genes, i.e. 8-17 genes and RNA processing in 1.1-5.5% of detected genomic windows, i.e. 57-161 windows) discovered is low. Does it suggest that most of RNA processing events were not spatially regulated across the tissue? Or does it question the assumption of treating spatial transcriptomics data similar to scRNA-seq data?

      We agree that the question of the prevalence of spatial RNA processing regulation is critical. Rather than the two options proposed here, we believe that the sparsity of the data limits our ability to call more of these events. In the revision, we will provide a supplemental figure showing the relationship between read depth and p value for each gene to quantify how the fraction of observed regulation changes with sequencing depth. It is worth noting that as these technologies improve, we expect the sequencing depth of spatial technologies to increase which would likely result in more discoveries.

      The unique features for ST data, such as mixture of neighboring cells, different capture biases and much smaller number of spots (pseudo cells here), may have significant effects on the power of scRNA-seq based methods, but it is not discussed in the manuscript. The lack of careful evaluation and low discovery rates could limit application of the approach to other tissues and subcellular data.

      We appreciate the concern that technical differences between scRNA-seq data and spatial transcriptomics data could affect our results. We agree that this point could be addressed more thoroughly in the text. None of the specificities of spatial transcriptomics data invalidate the assumptions of the SpliZ or ReadZS. The method we use to identify genes with significant spatial regulation of RNA processing was specifically created to be used for Visium data. It takes into account mixture of RNAs in neighboring cells by randomly sampling scores of neighboring cells, rather than randomization of the location of the spots themselves, which does indeed result in a high false positive rate (see “Permutations for Moran’s I” in the Methods). We do note that there is a limit to the power of this kind of analysis based on the number of spots and the read depth, which we will quantify in a plot in the revision.
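
      To make the statistic under discussion concrete, the sketch below computes Moran's I for per-spot scores. It is an assumption-laden illustration rather than the authors' implementation: spots are placed on a simple rectangular grid with binary edge-sharing neighbours (real Visium spots sit on a hexagonal lattice), the function names and toy scores are hypothetical, and the neighbour-based permutation test described in the Methods is not reproduced here.

```python
import numpy as np

def grid_weights(n_rows, n_cols):
    """Binary adjacency matrix linking each grid spot to the spots directly
    above, below, left, and right of it. Simplifying assumption: a
    rectangular grid, not the hexagonal Visium layout."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if r + 1 < n_rows:                 # neighbour below
                j = (r + 1) * n_cols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < n_cols:                 # neighbour to the right
                j = r * n_cols + c + 1
                w[i, j] = w[j, i] = 1.0
    return w

def morans_i(values, w):
    """Moran's I: (N / sum of weights) * (z' W z) / (z' z), z = centred values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

w = grid_weights(6, 6)
rows, cols = np.indices((6, 6))
gradient = cols.ravel().astype(float)               # smooth left-to-right trend
checker = ((rows + cols) % 2).ravel().astype(float)  # neighbours always differ

i_gradient = morans_i(gradient, w)   # positive: neighbouring spots are similar
i_checker = morans_i(checker, w)     # negative: neighbouring spots are dissimilar
```

      A spatially smooth score yields a positive value and an alternating pattern a negative one; whether an observed value is significant depends on the null distribution, which is the point of the permutation scheme discussed in the response.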

    2. eLife assessment:

      This important study describes spatial RNA processing events by combining methods for single-cell transcriptomics data with spatial transcriptomics data. The evidence supporting the claims of the authors is solid, although the analysis could be further strengthened by including a broader range of samples as well as orthogonal validation either by experimental methods or simulated data. The work will be of general interest to researchers in the spatial transcriptomics field as well as researchers investigating alternative pre-mRNA processing across diverse tissues.

    3. Reviewer #1 (Public Review):

      In this study, the authors set out to investigate spatial RNA processing events, specifically alternative splicing and 3' UTR usage, in mouse brain and kidney tissues using ReadZS and SpliZ methodologies on spatial transcriptomics data. The research contributes to understanding tissue-specific gene expression regulation from a spatial perspective. The study introduces a novel approach for analyzing spatial transcriptomics data, allowing for the identification of RNA processing and regulation patterns directly from 10X Visium data. The authors present convincing evidence supporting the identification of novel RNA processing patterns using their methodology, which holds significant implications for researchers in the field of spatial transcriptomics and the study of alternative splicing and 3' UTR usage.

      The conclusions of the study are mostly well-supported by the data; however, certain aspects could be improved to strengthen the findings.

      1) The conclusions of this study would be strengthened by conducting a more extensive tissue sample analysis and including biological replicates. Additionally, appropriate batch effect corrections should be applied when dealing with biological replicates.

      2) The 3' UTR usage and alternative splicing should be compared among clearly labeled clusters for a more comprehensive analysis.

      3) The authors should clarify their rationale for choosing ReadZS and SpliZ approaches and provide comparisons with other methods to demonstrate the advantages and potential limitations of their chosen methodologies.

    4. Reviewer #2 (Public Review):

      The authors applied existing ReadZS and the SpliZ methods, previously developed to analyze RNA processing in scRNA-seq data, to Visium data to study spatial splicing and RNA processing events in tissues by Moran's I. The authors showed several example genes in mouse brain and kidney, whose processing is spatially regulated, such as Rps24, Myl6, Gng13.

      The paper touches on an important question in RNA biology about how RNA processing is regulated spatially. Both experimental and computational challenges remain to address it. Despite some potentially interesting findings, most claims remain to be validated by orthogonal methods such as RNA FISH and simulations. In addition, the percentage of spatial processing events (splicing in 0.8-2.2% of detected genes, i.e. 8-17 genes and RNA processing in 1.1-5.5% of detected genomic windows, i.e. 57-161 windows) discovered is low. Does it suggest that most of RNA processing events were not spatially regulated across the tissue? Or does it question the assumption of treating spatial transcriptomics data similar to scRNA-seq data? The unique features for ST data, such as mixture of neighboring cells, different capture biases and much smaller number of spots (pseudo cells here), may have significant effects on the power of scRNA-seq based methods, but it is not discussed in the manuscript. The lack of careful evaluation and low discovery rates could limit application of the approach to other tissues and subcellular data.

    1. eLife assessment

      Brain inflammation is a hallmark of multiple sclerosis. Using novel spatial transcriptomics methods, the authors provide convincing evidence for a gradient of immune genes and inflammatory markers from the meninges toward the adjacent brain parenchyma in a mouse model. This important study advances our understanding of the mechanisms of brain damage in this autoimmune disease.

    2. Reviewer #1 (Public Review):

      Multiple sclerosis (MS) is a debilitating autoimmune disease that causes loss of myelin in neurons of the central nervous system. MS is characterized by the presence of inflammatory immune cells in several brain regions as well as the brain barriers (meninges). This study aims to understand the local immune hallmarks in regions of the brain parenchyma that are adjacent to the leptomeninges in a mouse model of MS. The leptomeninges are known to be a focus of inflammation in MS and perhaps "bleed" inflammatory cells and molecules to adjacent brain parenchyma regions. To do so, the authors use a novel technology called spatial transcriptomics, which keeps the spatial relationships between the two regions intact. The study identifies canonical inflammatory genes and gene sets such as complement and B cells enriched in the parenchyma in close proximity to the leptomeninges in the mouse model of MS but not control. The manuscript is very well written and easy to follow. The results will become a useful resource to others working in the field and can be followed by time series experiments where the same technology can be applied to the different stages of the disease.

    3. Reviewer #2 (Public Review):

      Accumulating data suggests that the presence of immune cell infiltrates in the meninges of the multiple sclerosis brain contributes to the tissue damage in the underlying cortical grey matter by the release of inflammatory and cytotoxic factors that diffuse into the brain parenchyma. However, little is known about the identity and direct and indirect effects of these mediators at a molecular level. This study addresses the vital link between an adaptive immune response in the CSF space and the molecular mechanisms of tissue damage that drive clinical progression. In this short report the authors use a spatial transcriptomics approach using Visium Gene Expression technology from 10x Genomics, to identify gene expression signatures in the meninges and the underlying brain parenchyma, and their interrelationship, in the PLP-induced EAE model of MS in the SJL mouse. MRI imaging using a high field strength (11.7T) scanner was used to identify areas of meningeal infiltration for further study. They report, as might be expected, the upregulation of genes associated with the complement cascade, immune cell infiltration, antigen presentation, and astrocyte activation. Pathway analysis revealed the presence of TNF, JAK-STAT and NFkB signaling, amongst others, close to sites of meningeal inflammation in the EAE animals, although the spatial resolution is insufficient to indicate whether this is in the meninges, grey matter, or both.

      UMAP clustering illuminated a major distinct cluster of upregulated genes in the meninges and smaller clusters associated with the grey matter parenchyma underlying the infiltrates. The meningeal cluster contained genes associated with immune cell functions and interactions, cytokine production, and action. The parenchymal clusters included genes and pathways related to glial activation, but also adaptive/B-cell mediated immunity and antigen presentation. This again suggests a technical inability to resolve fully between the compartments as immune cells do not penetrate the pial surface in this model or in MS. Finally, a trajectory analysis based on distance from the meningeal gene cluster successfully demonstrated descending and ascending gradients of gene expression, in particular a decline in pathway enrichment for immune processes with distance from the meninges.

      Although these results confirm what we already know about processes involved in the meninges in MS and its models and gradients of pathology in sub-pial regions, this is the first to use spatial transcriptomics to demonstrate such gradients at a molecular level in an animal model that demonstrates lymphoid like tissue development in the meninges and associated grey matter pathology. The mouse EAE model being used here does reproduce many, although not all, of the pathological features of MS and the ability to look at longer time points has been exploited well. However, this particular spatial transcriptomics technique cannot resolve at a cellular level and therefore there is a lot of overlap between gene expression signatures in the meninges and the underlying grey matter parenchyma.

      The short nature of this report means that the results are presented and discussed in a vague way, without enough molecular detail to reveal much information about molecular pathogenetic mechanisms.

      The trajectory analysis is a good way to explore gradients within the tissues and the authors are to be applauded for using this approach. However, the trajectory analysis does not tell us much if you only choose 2 genes that you think might be involved in the pathogenetic processes going on in the grey matter. It might be more useful to choose some genes involved in pathogenetic processes that we already know are involved in the tissue damage in the underlying grey matter in MS, for which there is already a lot of literature, or genes that respond to molecules we know are increased in MS CSF, although the animal models may be very different. Why were C3 and B2m chosen here?

      Strengths:<br /> - The mouse model does exhibit many of the features of the compartmentalized immune response seen in MS, including the presence of meningeal immune cell infiltrates in the central sulcus and over the surface of the cortex, with the presence of FDC's HEVs PNAd+ vessels and CXCL13 expression, indicating the formation of lymphoid like cell aggregates. In addition, disruption of the glia limitans is seen, as in MS. Increased microglial reactivity is also present at the pial surface.<br /> - Spatial transcriptomics is the best approach to studying gradients in gene expression in both white matter and grey matter and their relationship between compartments.<br /> - It would be useful to have more discussion of how the upregulated pathways in the two compartments fit with what we know about the cellular changes occurring in both, for which presumably there is prior information from the group's previous publications.

      Limitations:<br /> - EAE in the mouse is not MS and may be far removed when one considers molecular mechanisms, especially as MS is not a simple anti-myelin protein autoimmune condition. Therefore, this study could be following gene trajectories that do not exist in MS. This needs a significant amount of discussion in the manuscript if the authors suggest that it is mimicking MS.<br /> - The model does not have the cortical subpial demyelination typical of MS and it is unknown whether neuronal loss occurs in this model, which is the main feature of cytokine-mediated neurodegeneration in MS. If it does not then a whole set of genes will be missing that are involved in the neuronal response to inflammatory stimuli that may be cytotoxic.<br /> - Visium technology does not get down to single cell level and does not appear to allow resolution of the border between the meninges and the underlying grey matter.<br /> - Neuronal loss in the MS cortex is independent of demyelination and therefore not related to remyelination failure. There does not appear to be any cortical grey matter demyelination in these animals, so it is difficult to relate any of the gene changes seen here to demyelination.<br /> - No mention of how the ascending and descending patterns of gene expression may be due to the gradient of microglial activation that underlies meningeal inflammation, which is a big omission.

    4. Reviewer #3 (Public Review):

      In this study, Gadani et al. induced EAE in SJL/J mice and performed a comprehensive spatial transcriptomic analysis in areas of meningeal inflammation during the relapse phase of the disease. The authors found specific enrichment in spatial gene signatures (cluster 11) in the regions of increased contrast-enhancement by MRI (where meningeal extravasation of activated immune cells is observed) that overlap with signatures in the adjacent brain parenchyma, namely the thalamus. Several pathways were similarly upregulated in the meningeal-associated cluster 11 and adjacent parenchymal clusters (like adaptive mediated immunity, and antigen processing and presentation), suggestive of a "leakage" of inflammatory mediators from the meninges into the brain during the re-activation of disease. The tested hypothesis, as well as the data presented in this study, is quite interesting and novel.

    5. Author Response:

      We thank Reviewer #1 for their positive assessment of our work.

      Reviewer #2 (Public Review):

      […] Although these results confirm what we already know about processes involved in the meninges in MS and its models and gradients of pathology in sub-pial regions, this is the first to use spatial transcriptomics to demonstrate such gradients at a molecular level in an animal model that demonstrates lymphoid like tissue development in the meninges and associated grey matter pathology. The mouse EAE model being used here does reproduce many, although not all, of the pathological features of MS and the ability to look at longer time points has been exploited well. However, this particular spatial transcriptomics technique cannot resolve at a cellular level and therefore there is a lot of overlap between gene expression signatures in the meninges and the underlying grey matter parenchyma.

      We appreciate the reviewer’s concise summary and comments on our manuscript. We agree that the Visium spatial sequencing technology we applied is limited in its resolution and cannot precisely distinguish individual cells or anatomic regions. For that reason, there is undoubtedly some overlap between gene expression signatures in the meninges and underlying parenchyma, particularly in spots on the borders of the meningeal inflammation clusters. However, we believe that the majority of meningeal inflammation (“cluster 11”) spots are indeed in the meninges and represent the spatial transcriptome of that niche. To support this, in the revised manuscript we will provide H&E images with the UMAP clusters overlaid to demonstrate the anatomic borders that correlate with the clusters.

      The short nature of this report means that the results are presented and discussed in a vague way, without enough molecular detail to reveal much information about molecular pathogenetic mechanisms.

      We thank the reviewer for this comment. The goal of this work is to transcriptomically characterize the spatial relationship between areas of meningeal inflammation and the underlying parenchyma. While we agree that mechanistic studies are needed to further evaluate the role of presented signaling pathways, those experiments are beyond the scope of this brief report.

      The trajectory analysis is a good way to explore gradients within the tissues and the authors are to be applauded for using this approach. However, the trajectory analysis does not tell us much if you only choose 2 genes that you think might be involved in the pathogenetic processes going on in the grey matter. It might be more useful to choose some genes involved in pathogenetic processes that we already know are involved in the tissue damage in the underlying grey matter in MS, for which there is already a lot of literature, or genes that respond to molecules we know are increased in MS CSF, although the animal models may be very different. Why were C3 and B2m chosen here?

      We appreciate the reviewer’s points here. C3 and B2m were chosen as examples of genes that have differential fit to the gradient descending pattern to assist the reader in interpreting subsequent gene set trajectory analysis. However, we agree that there are many other genes of interest and will expand the number of genes displayed in our revised manuscript. 

      Strengths: <br /> - The mouse model does exhibit many of the features of the compartmentalized immune response seen in MS, including the presence of meningeal immune cell infiltrates in the central sulcus and over the surface of the cortex, with the presence of FDC's HEVs PNAd+ vessels and CXCL13 expression, indicating the formation of lymphoid like cell aggregates. In addition, disruption of the glia limitans is seen, as in MS. Increased microglial reactivity is also present at the pial surface. <br /> - Spatial transcriptomics is the best approach to studying gradients in gene expression in both white matter and grey matter and their relationship between compartments. <br /> - It would be useful to have more discussion of how the upregulated pathways in the two compartments fit with what we know about the cellular changes occurring in both, for which presumably there is prior information from the group's previous publications.

      Limitations: <br /> - EAE in the mouse is not MS and may be far removed when one considers molecular mechanisms, especially as MS is not a simple anti-myelin protein autoimmune condition. Therefore, this study could be following gene trajectories that do not exist in MS. This needs a significant amount of discussion in the manuscript if the authors suggest that it is mimicking MS. <br /> - The model does not have the cortical subpial demyelination typical of MS and it is unknown whether neuronal loss occurs in this model, which is the main feature of cytokine-mediated neurodegeneration in MS. If it does not then a whole set of genes will be missing that are involved in the neuronal response to inflammatory stimuli that may be cytotoxic. <br /> - Visium technology does not get down to single cell level and does not appear to allow resolution of the border between the meninges and the underlying grey matter. <br /> - Neuronal loss in the MS cortex is independent of demyelination and therefore not related to remyelination failure. There does not appear to be any cortical grey matter demyelination in these animals, so it is difficult to relate any of the gene changes seen here to demyelination. <br /> - No mention of how the ascending and descending patterns of gene expression may be due to the gradient of microglial activation that underlies meningeal inflammation, which is a big omission.

      We thank the reviewer for their insightful comments on the strengths and limitations of our study. Regarding the SJL EAE model we use in this paper, it certainly is not a perfect model of meningeal inflammation in MS, indeed we believe that no such animal model exists, but it does recapitulate several key features of human disease as described by the reviewer. Spatial transcriptomics of cortical grey matter lesions and overlying meninges of samples derived from patients with MS would be ideal, though access to this tissue is highly limited. In the revised manuscript we will include more detailed discussion of the limitations in applying these findings to MS. However, in addition to potential implications for MS research, our data contribute more generally to understanding of meningeal inflammation and penetrance of inflammation into brain tissue.

      We acknowledge that sub-pial neuronal loss has not been assessed in SJL EAE, and if present it would increase the relevance of this model to neurodegeneration. We are currently working to assess this.

      We agree with the reviewer that Visium technology is limited in its ability to discriminate individual cells, as discussed above (2.2).

      We agree that gene expression by activated microglia is likely a major driver of the transcriptomic changes observed in the parenchyma, and thank the reviewer for highlighting this. We will add discussion of this to our revised manuscript, and intend to generate additional data regarding the contribution of subpial microglial activation to the measured transcriptomic changes.

      Finally, we thank Reviewer #3 for their assessment of our work.

    1. Author Response

      eLife assessment:

      Trypanosoma brucei evades mammalian humoral immunity through the expression of different variant surface glycoprotein genes. In this fundamental paper, the authors extend previous observations that TbRAP1 both interacts with PIP5pase and binds PI(3,4,5)P3, indicating a role for PI(3,4,5)P3 binding and suggesting that antigen switching is signal dependent. While much of the evidence is compelling, one reviewer suggested that the work would benefit from further controls.

      We appreciate the evaluation of the work and agree that the findings substantially advance our understanding of antigenic variation. A detailed response to the public review is included below, which addresses and clarifies the issues raised by the reviewers, including those concerning controls. We also want to highlight the comment by Reviewer #3 “The methods used in the study are rigorous and well-controlled…. their results support the conclusions made in the manuscript.”. We hope this and our comments will help address the issue of controls in this eLife statement.

      Reviewer #1 (Public Review):

      Trypanosoma brucei undergoes antigenic variation to evade the mammalian host’s immune response. To achieve this, T. brucei regularly expresses different VSGs as its major surface antigen. VSG expression sites are exclusively subtelomeric, and VSG transcription by RNA polymerase I is strictly monoallelic. It has been shown that T. brucei RAP1, a telomeric protein, and the phosphoinositol pathway are essential for VSG monoallelic expression. In previous studies, Cestari et al. (ref. 24) have shown that PIP5pase interacts with RAP1 and that RAP1 binds PI(3,4,5)P3. RNAseq and ChIPseq analyses have been performed previously in PIP5pase conditional knockout cells, too (ref. 24). In the current study, Touray et al. did similar analyses except that catalytic dead PIP5pase mutant was used and the DNA and PI(3,4,5)P3 binding activities of RAP1 fragments were examined. Specifically, the authors examined the transcriptome profile and did RAP1 ChIPseq in PIP5pase catalytic dead mutant. The authors also expressed several C-terminal His6-tagged RAP1 recombinant proteins (full-length, aa1-300, aa301-560, and aa 561-855). These fragments’ DNA binding activities were examined by EMSA analysis and their phosphoinositides binding activities were examined by affinity pulldown of biotin-conjugated phosphoinositides. As a result, the authors confirmed that VSG silencing (both BES-linked and MES-linked VSGs) depends on PIP5pase catalytic activity, but the overall knowledge improvement is incremental. The most convincing data come from the phosphoinositide binding assay as it clearly shows that N-terminus of RAP1 binds PI(3,4,5)P3 but not PI(4,5)P2, although this is only assayed in vitro, while the in vivo binding of full-length RAP1 to PI(3,4,5)P3 has been previously published by Cestari et al (ref. 24) already. 
Considering that many phosphoinositides exert their regulatory role by modulating the subcellular localization of their bound proteins, it is reasonable to hypothesize that binding to PI(3,4,5)P3 can remove RAP1 from the chromatin. However, no convincing data have been shown to support the author’s hypothesis that this regulation is through an “allosteric switch”. Therefore, the title should be revised.

      We appreciate the reviewer’s detailed evaluation of our work. There are a few general comments that we would like to clarify. We will break them into three points. All data included here are new and were not previously published.

      i) “RNAseq and ChIPseq analyses have been performed previously …(ref. 24).” Reference 24 is Cestari et al. 2019, Mol Cell Biol. Neither we nor others have published ChIP-seq of RAP1 in T. brucei. Previous work showed ChIP-qPCR, which analyzes specific loci. The ChIP-seq shows genome-wide binding sites of RAP1, and new findings are shown here, including binding sites in the BES, MESs, and other genomic loci such as centromeres. We also identified DNA sequence bias defining RAP1 binding sites (Fig 2A), and we show by ChIP-seq how RAP1 binding to these loci changes upon expression of catalytically inactive PIP5Pase. As for the RNA-seq, this is also the first time we show RNA-seq of T. brucei expressing catalytically inactive PIP5Pase, which establishes that the regulation of VSG silencing and switching is dependent on PIP5Pase enzyme catalysis, i.e., PI(3,4,5)P3 dephosphorylation. To improve clarity in the manuscript, we edited page 4, line 122, as follows: “We showed that RAP1 binds telomeric or 70 bp repeats (24), but it is unknown if it binds to other ES sequences or genomic loci.”

      ii) “The in vivo binding of full-length RAP1 to PI(3,4,5)P3 has been previously published by Cestari et al. (ref. 24) already.” We published in reference 24 that RAP1-HA can bind synthetic PI(3,4,5)P3 conjugated to agarose beads. Here, we were able to measure T. brucei endogenous PI(3,4,5)P3 associated with RAP1-HA (Fig 4F). Moreover, we showed that binding of endogenous PI(3,4,5)P3 to RAP1-HA is about 100-fold higher in cells expressing catalytically inactive PIP5Pase than in cells expressing WT PIP5Pase. The data establish that endogenous PI(3,4,5)P3 binds to RAP1-HA in vivo and show how the binding changes in cells expressing mutant PIP5Pase; these data are new and relevant to our conclusions.

      iii) “no convincing data have been shown to support the author’s hypothesis that this regulation is through an “allosteric switch””. We show here in vitro and in vivo data supporting the conclusion. We show that PI(3,4,5)P3 binds to the N-terminus of rRAP1-His with a calculated Kd of about 20 µM (Fig 4B-E, Table 1). In contrast, we show by EMSA and binding kinetics by microscale thermophoresis that rRAP1-His binds to 70 bp and telomeric repeats via protein regions encompassing the Myb (central) or Myb-L domains (C-terminal) but not the N-terminus containing the VHP domain (Fig 3C-G, and Fig S5). Using microscale thermophoresis, we also show that rRAP1-His binds to 70 bp and telomeric repeats with Kd of 10 and 24 nM, respectively (Fig 3 and Table 1). Notably, we show that 30 µM of PI(3,4,5)P3, but not PI(4,5)P2 – used as a control – disrupts rRAP1-His binding to 70 bp and telomeric repeats, changing Kds to about 188 and 155 nM, respectively (Fig 5A-C). We also show that PI(3,4,5)P3 does not disrupt the binding of rRAP1-His fragments (Myb or Myb-L) without the N-terminus domain (Fig S5), implying that binding of PI(3,4,5)P3 to the RAP1 N-terminus is required for displacement of RAP1 DNA binding domains (Myb and Myb-L) from telomeric and 70 bp repeats, and that PI(3,4,5)P3 is not competing for Myb or Myb-L binding to DNA. Moreover, we show that RAP1-HA binding to 70 bp and telomeric repeats in vivo is displaced in T. brucei cells expressing catalytically inactive PIP5Pase (Fig 5D-G), which we show results in RAP1-HA binding about 100-fold more endogenous PI(3,4,5)P3 than in T. brucei expressing WT PIP5Pase (Fig 4F). The in vivo data agree with the in vitro data. The data show a typical allosteric regulator system, in which binding of a ligand to one site of the protein, here PI(3,4,5)P3 binding to the RAP1 N-terminus, affects the binding of other domains (RAP1 Myb and Myb-L domains) to DNA. 
To improve the clarity of the title, we will change it in the revised version to imply a direct role of PI(3,4,5)P3 regulation of RAP1 in the process. This will provide more specific information to the readers and address the reviewer's concern related to the “allosteric switch”. The new title will be: PI(3,4,5)P3 allosteric regulation of RAP1 controls antigenic switching in trypanosomes

      There are serious concerns about many conclusions made by Touray et al., according to their experimental approaches:

      1) The authors have been studying RAP1’s chromatin association pattern by ChIPseq in cells expressing a C-terminal HA tagged RAP1. According to data from tryptag.org, RAP1 with an N-terminal or a C-terminal tag does not seem to have identical subcellular localization patterns, suggesting that adding tags at different positions of RAP1 may affect its function. It is therefore essential to validate that the C-terminally HA-tagged RAP1 still has its essential functions. However, this data is not available in the current study. RAP1 is essential. If RAP1-HA still retains its essential functions, cells carrying one RAP1-HA allele and one deleted allele are expected to grow the same as WT cells. In addition, these cells should have the WT VSG expression pattern, and RAP1-HA should still interact with TRF. Without these validations, it is impossible to judge whether the ChIPseq data obtained on RAP1-HA reflect the true chromatin association profile of RAP1.

      Tryptag data show nuclear localization for both N- and C-terminally tagged RAP1 in procyclic forms, although there are differences in signal intensities in the images (http://tryptag.org/?id=Tb927.11.370). It is important to note that Tryptag data is from procyclic forms, and DNA constructs are not validated for their integration in the correct locus. As for the RAP1-HA localization in bloodstream forms, we demonstrated that C-terminally HA-tagged RAP1 co-localizes with telomeres by a combination of immunofluorescence and fluorescence in situ hybridization (Cestari and Stuart, 2015, PNAS), and RAP1-HA co-immunoprecipitates telomeric and 70 bp repeats (Cestari et al. 2019 Mol Cell Biol). We also showed by immunoprecipitation and mass spectrometry that HA-tagged RAP1 interacts with nuclear and telomeric proteins, including PIP5Pase (Cestari et al. 2019). Others have also tagged T. brucei RAP1 in bloodstream forms with HA without disrupting its nuclear localization (Yang et al. 2009, Cell; Afrin et al. 2020, Science Advances). As for the experiment suggested by the reviewer, there is no guarantee that cells lacking one allele of RAP1 will behave as wildtype, i.e., normal growth and repression of VSG genes. Also, less than 90% of T. brucei TRF was reported to interact with RAP1 (Yang et al. 2009, Cell), which might be indirect via their binding to telomeric DNA repeats rather than direct protein-protein interactions.

      2) Touray et al. expressed and purified His6-tagged recombinant RAP1 fragments from E. coli and used these recombinant proteins for EMSA analysis: The His6 tag has been used for purifying various recombinant proteins. It is most likely that the His6 tag itself does not convey any DNA binding activities. However, using His6-tagged RAP1 fragments for EMSA analysis has a serious concern. It has been shown that His6-tagged human RAP1 protein can bind dsDNA, but hRAP1 without the His6 tag does not. It is possible that RAP1 proteins in combination with the His6 tag can exhibit certain unnatural DNA binding activities. To be rigorous, the authors need to remove the His6 tag from their recombinant proteins before the in vitro DNA binding analyses are performed. This is a standard procedure for many in vitro assays using recombinant proteins.

      We show in Fig 3C-G that His-tagged full-length rRAP1 does not bind to scrambled telomeric dsDNA sequences, which indicates that His-tagged rRAP1 does not bind unspecifically to DNA. Moreover, in Fig 3G, we show that His-tagged rRAP1(aa 1-300) also does not bind to 70 bp or telomeric repeats. In contrast, full-length His-tagged rRAP1, rRAP1(aa 301-560), or rRAP1(aa 561-855) bind to 70 bp or telomeric repeats (Fig 3C-G). Since all proteins were His-tagged, the His tag cannot be responsible for the DNA binding.

      As for the statement that human rRAP1-His has unspecific DNA binding properties, we could not find a reference for it, and we cannot make a comparison without knowing the details of the experiment. Biochemical assays can result in unspecific binding depending on binding/buffer conditions. Also, human and T. brucei RAP1 share only 15% amino acid identity; unspecific binding to DNA could be specific to human RAP1.

      3) It is unclear why Nanopore sequencing was used for RNAseq and ChIPseq experiments. The greatest benefit of Nanopore sequencing is that it can sequence long reads, which usually helps with mapping, particularly at genome loci with repetitive sequences. This seems beneficial for RAP1 ChIPseq analysis as RAP1 is expected to bind telomere repeats. However, for ChIPseq, the chromatin needs to be fragmented. Larger DNA fragments from ChIPseq experiments will decrease the accuracy of the final calculated binding sites. Therefore, ChIPseq experiments are not supposed to have long reads to start with, so Nanopore sequencing does not seem to bring any advantage. In addition, compared to Illumina sequencing, Nanopore sequencing usually yields smaller numbers of reads, and the sequencing accuracy rate is lower. The Nanopore sequencing accuracy may be a serious concern in the current study. All telomeres have the perfect TTAGGG repeats, all VSG genes have a very similar 3’ UTR, and all 70 bp repeats have very similar sequences. In fact, the active and silent ESs have 90% sequence identity. Are sequence reads accurately mapped to different ESs? How is the sequencing and mapping quality controlled? Furthermore, it is unclear whether the read depth for RNAseq is deep enough.

      The mean sequence length for the ChIP-seq was about 500 bp (see Table S3), which helps to align reads to ESs and distinguish the different ESs, and it is a reasonable size range to define RAP1 binding sites. Although sequencing depths are usually higher in Illumina than in nanopore (all depending on the amount of sequencing), most Illumina short reads map to multiple genomic sequences, making it difficult to distinguish ESs. This is particularly important for RAP1 because it binds to repeats such as 70 bp and telomeric repeats. Mapping short reads to those regions would be virtually impossible; hence, our choice of nanopore sequencing. For RNA-seq, the ~500 bp read length helps sequence alignment to the subtelomeric regions containing many VSG genes. The nanopore reads obtained here had an average sequencing quality score of 12 (i.e., base call accuracy of 94%). Filtering reads with MAPQ ≥ 20 (99% probability of correct alignment) helped us to distinguish RAP1 binding to specific ESs, including silent vs active ES (ChIP-seq) or VSG sequences (RNA-seq). The details of the analysis and sequencing metrics (i.e., sequencing depth and read length) were described in the Methods section “Computational analysis of RNA-seq and ChIP-seq” and Table S3, respectively.
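
      The two percentages quoted in this response follow from the standard Phred scaling, in which a score Q corresponds to an error probability of 10^(-Q/10). The short Python sketch below only illustrates that conversion; the function names are hypothetical and it is not taken from the authors' analysis pipeline.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Return the probability that a call is correct for a Phred-scaled score q."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Inverse conversion: Phred-scaled score for a given probability of being correct."""
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    # Scores quoted in the response: mean read quality 12 and a MAPQ cutoff of 20.
    for label, q in [("mean read quality", 12), ("MAPQ cutoff", 20)]:
        print(f"{label}: Q{q} -> {phred_to_accuracy(q):.1%} probability of being correct")
```

      Under this scaling a quality score of 12 corresponds to roughly 94% per-base accuracy, and MAPQ 20 to a 99% probability that the alignment position is correct, in line with the figures given above.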

      4) Many statements in the discussion section are speculations without any solid evidence. For example, lines 218 - 219 “likely due to RAP1 conformational changes”, no data have been shown to support this at all. In lines 224-226, the authors acknowledged that more experiments are necessary to validate their observations, so it is important for the authors to first validate their findings before they draw any solid conclusions. Importantly, RAP1 has been shown to help compact telomeric and subtelomeric chromatin a long time ago by Pandya et al. (2013. NAR 41:7673), who actually examined the chromatin structure by MNase digestion and FAIRE. The authors should acknowledge previous findings. In addition, the authors need to revise the discussion to clearly indicate what they “speculate” rather than make statements as if it is a solid conclusion.

      The statement “likely due to RAP1 conformational changes” in lines 218-219 (page 6) is part of the Discussion. We did not make a strong statement but discussed a possibility. We believe that it is beneficial to the reader to have the data discussed, and we do not feel this point is overly speculative.

      For lines 224-226 (page 6), the statement refers to the finding of RAP1 binding to centromeric regions by ChIP-seq, which is a new finding but not the focus of this work. Hence, future studies are necessary for this finding, and we believe it is appropriate in the Discussion to be upfront and highlight this point to the readers. However, for the RAP1 binding to telomeric ES sites, e.g., 70 bp repeats and telomeric repeats (the focus of this work), we validated the binding by EMSA and by performing binding kinetics using microscale thermophoresis.

      We did not include Pandya et al. 2013 NAR because the authors demonstrated RAP1 compaction of chromatin to occur in procyclic forms only. Pandya et al. stated in their abstract: “no significant chromatin structure changes were detected on depletion of TbRAP1 in BF cells”. Hence, the suggested reference is not relevant to the context of our conclusions in bloodstream forms. Nevertheless, we have reviewed the Discussion to avoid broad speculations in the revised version of the manuscript.

      There are also minor concerns:

      1) In the PIP5Pase conditional knockout system, the WT or mutant PIP5Pase with a V5 tag is constitutively expressed from the tubulin array. What’s the relative expression level of this allele and the endogenous PIP5Pase? Without a clear knowledge of the mutant expression level, it is hard to conclude whether the mutant has any dominant negative effects or whether the mutant phenotype is simply due to a lower than WT PIP5Pase expression level.

      The relative mRNA levels of the exclusive expression of PIP5Pase Mut compared to the WT are available in the Data S1, RNA-seq. The Mut allele’s relative expression level is 0.85-fold to the WT allele (both from tubulin loci). We also showed by Western blot the WT and Mut PIP5Pase protein expression (Cestari et al. 2019, Mol Cell Biol). Concerning PIP5Pase endogenous alleles, we compared RNA-seq read counts per million from the conditional null PIP5Pase cells exclusively expressing WT or the Mut PIP5Pase alleles (Data S1, this work) to our previous RNA-seq of single-marker 427 strain (Cestari et al. 2019, Mol Cell Biol). We used the single-marker 427 because the conditional null cells were generated in this strain background. The PIP5Pase WT and Mut mRNAs expressed from tubulin loci are 1.6 and 1.3-fold the endogenous PIP5Pase levels in single-marker 427, respectively. We include a statement in the Methods, page 7, lines 265-268: “The WT or Mut PIP5Pase mRNAs exclusively expressed from tubulin loci are 1.6 and 1.3-fold the WT PIP5Pase mRNA levels expressed from endogenous alleles in the single marker 427 strain. The fold-changes were calculated from RNA-seq reads counts per million from this work (WT and Mut PIP5Pase, Data S1) and our previous RNA-seq from single marker 427 strain (24).”

      2) In EMSA analysis, what are the concentrations of the protein and the probe used in each reaction? The amount of protein used in the binding assay appears to be very high, and this can contribute to the observation that many complexes are stuck in the well. Better quality EMSA data need to be shown to support the authors’ claims.

      All concentrations were provided in the Methods section. See page 9 Electrophoretic mobility shift assays: “100 nM of annealed DNA were mixed with 1 μg of recombinant protein…”. For microscale thermophoresis, also see page 9, Microscale thermophoresis binding kinetics: “1 μM rRAP1 was diluted in 16 two-fold serial dilutions in 250 mM HEPES pH 7.4, 25 mM MgCl2, 500 mM NaCl, and 0.25% (v/v) NP-40 and incubated with 20 nM telomeric or 70 bp repeats…”. Note that two different biochemical approaches, EMSA and microscale thermophoresis, were used to assess rRAP1-His binding to DNA. Both show similar results (Fig 3 and 5, and Fig S5; microscale thermophoresis shows the binding kinetics, data available in Table 1). The EMSA images clearly show the binding of RAP1 to 70 bp or telomeric repeats but not to scrambled telomeric repeat DNA.

      Reviewer #2 (Public Review):

      This manuscript by Touray, et al. provides a significant new twist to our understanding of how antigenic variation may be regulated in T. brucei. Key aspects of antigenic variation are the mutually exclusive expression of a single antigen per cell and the periodic switching from expression of one antigen isoform to another. In this manuscript, the authors show, as they have previously shown, that depletion of the nuclear phosphatidylinositol 5-phosphatase (PIP5Pase) results in a loss of mutually exclusive VSG expression. Furthermore, using ChIP-seq, the authors show that the repressor/activator protein 1 (RAP1) binds to regions upstream and downstream of VSG genes located in transcriptionally repressed expression sites and that this binding is lost in the absence of a functional PIP5Pase. Importantly, the authors decided to further investigate this link between PIP5Pase and RAP1, a protein that has previously been implicated in antigenic variation in T. brucei, and found that inactivation of PIP5Pase results in the accumulation of PI(3,4,5)P3 bound to the RAP1 N-terminus and that this binding impairs the ability of RAP1 to bind DNA. Based on these observations, the authors suggest that the levels of PI(3,4,5)P3 may determine the cellular function of RAP1, either by binding upstream of VSG genes and repressing their function, or by not binding DNA and allowing the simultaneous expression of multiple VSG genes in a single parasite.

      While I find most of the data presented in this manuscript compelling, there are aspects of Figure 1 that are not clear to me. Based on Figure 1F, the authors claim that transient inactivation of PIP5Pase results in a switch from the expression of one VSG isoform to another. However, I am not exactly sure what the authors are showing in this panel, nor do the data in Figure 1F seem to be consistent with those shown in Figure 1C. Based on Figure 1F, a transient inactivation of PIP5Pase appears to result in an almost exclusive switch to a VSG located in BES12. However, based on Figure 1E, the VSG transcripts most commonly found after a transient inactivation of PIP5Pase are those from the previously active VSG (BES1) and VSGs located on chr 1 and 6 (I believe). The small font and the low resolution make it impossible to infer the location of the expressed VSG genes, nor to confirm that ALL VSG genes located in expression sites are activated, as the authors claim. Also, I was not able to access the raw ChIP-seq and RNA-seq reads. Thus, I could not evaluate the quality of the sequencing data.

      We appreciate the reviewer’s comments and evaluation of our work. Fig 1E shows VSG-seq of a population after transient (24h) exclusive expression of the PIP5Pase mutant, followed by re-expression of the WT PIP5Pase allele for 60 hours (multiple VSGs are detected). As a control, it also shows VSG-seq in cells continuously expressing WT PIP5Pase (mostly VSG2, BES1 is detected). Fig 1F and Fig S1 show the sequencing of VSGs expressed by clones isolated (5-6 days of growth) after a temporary knockdown (24h) of PIP5Pase (tet -), followed by its re-expression. For comparison, no knockdown (tet +) was included. Fig 1E shows potential switchers in the population, and Fig 1F confirms VSG switching in clones.

      To clarify the difference between Fig 1E and 1F, we edited the manuscript on page 3, lines 103-110: “To verify PIP5Pase role in VSG switching, we knocked down PIP5Pase for 24h (Tet -), then restored its expression (Tet +) and isolated clones by limiting dilution and growth for 5-6 days. Analysis of isolated clones after temporary PIP5Pase knockdown (Tet -/+) confirmed VSG switching in 93 out of 94 (99%) of the analyzed clones (Fig 1F, Fig S1). The cells switched to express VSGs from silent ESs or subtelomeric regions, indicating switching by transcription or recombination mechanisms. Moreover, no switching was detected in 118 isolated clones from cells continuously expressing WT PIP5Pase (Tet +, Fig 1F).” We also edited Fig 1F to indicate temporary knockdown (Tet -/+) vs no knockdown (Tet +). The modifications will be available in the resubmitted version of the manuscript.

      We agree that the heat map is difficult to read due to the amount of information. We will include in the revised version of the manuscript a table with the data in the supplementary information; the reader will be able to evaluate the data in detail.

      A preference for switching to specific ESs has been observed in T. brucei (Morrison et al. 2005, Int J Parasitol; Cestari and Stuart, 2015, PNAS), which may explain several clones switching to BES12. Many potential switchers were detected in the VSG-seq (Fig 1E, the whole cell population is over 10^7 parasites), but not all potential switchers were detected in the clonal analysis because we analyzed 212 clones total, a fraction of the over 10^7 cells analyzed by VSG-seq (Fig 1E). Also, it is possible that not all potential switchers are viable. However, the point of the clonal analysis is to validate the VSG switching after genetic perturbation of PIP5Pase.

      Fig 1C shows examples of ES derepression by RNA-seq after 24h exclusive expression of the mutant compared to WT PIP5Pase. The RNA-seq shows that all ESs are derepressed (Fig 1B). This can be visualized in the volcano plot (Fig 1B, BES and MES VSGs are labelled) and on the spreadsheet Data S1. Although all ESs are derepressed after PIP5Pase mutant expression, not all ESs are selected during switching, as observed in Fig 1E-F. This agrees with our previous observations in switching assays with proteins that control VSG switching (Cestari and Stuart, 2015, PNAS).

      As for the sequencing metrics and raw sequencing data, see the Methods section, page 13, lines 483-485: “Sequencing information is available in Table S3 and fastq data is available in the Sequence Read Archive (SRA) with the BioProject identification PRJNA934938.” Table S3 has a summary of sequencing data. Metrics information such as sequencing quality and analysis can be found in the Methods section “Computational analysis of RNA-seq and ChIP-seq”. The latter includes information about nanopore reads, i.e., mean Q-score of 12.

      Reviewer #3 (Public Review):

      In this manuscript, Touray et al investigate the mechanisms by which PIP5Pase and RAP1 control VSG expression in T. brucei and demonstrate an important role for this enzyme in a signalling pathway that likely plays a role in antigenic variation in T. brucei.

      The methods used in the study are rigorous and well-controlled. The authors convincingly demonstrate that RAP1 binds to PI(3,4,5)P3 through its N-terminus and that this binding regulates RAP1 binding to VSG expression sites, which in turn regulates VSG silencing. Overall their results support the conclusions made in the manuscript.

      There are a few small caveats that are worth noting. First, the analysis of VSG derepression and switching in Figure 1 relies on a genome that does not contain minichromosomal (MC) VSG sequences. This means that MC VSGs could theoretically be misassigned as coming from another genomic location in the absence of an MC reference. As the origin of the VSGs in these clones isn’t a major point in the paper, I do not think this is a major concern, but I would not over-interpret the particular details of switching outcomes in these experiments.

      The authors state that “our data imply that antigenic variation is not exclusively stochastic.” I am not sure this is true. While I also favor the idea that switching is not exclusively stochastic, evidence for a signaling pathway does not necessarily imply that antigenic variation is not stochastic. This pathway could be important solely for lifecycle-related control of VSG expression, rather than antigenic variation during infection. Nevertheless, these data are critical for establishing a potential pathway that could control antigenic variation and thus represent a fundamental discovery.

      Another aspect of this work that is perhaps important, but not discussed much by the authors, is the fact that signalling is extremely poorly understood in T. brucei. In Figure 1B, the RNA-seq data show many genes upregulated after expression of the Mut PIP5Pase (not just VSGs). The authors rightly avoid claiming that this pathway is exclusive to VSGs, but I wonder if these data could provide insight into the other biological processes that might be controlled by this signaling pathway in T. brucei.

      Overall, this is an excellent study that represents an important step forward in understanding how antigenic variation is controlled in T. brucei. The possibility that this process could be controlled via a signalling pathway has been speculated for a long time, and this study provides the first mechanistic evidence for that possibility.

      We thank the reviewer for the evaluation of our work. We agree that it is difficult to ensure the origin of all VSG genes not having minichromosome sequences; hence we did not emphasize this point in the manuscript. We used the 427-2018 reference genome assembled by PacBio and Hi-C (Muller et al. 2018, Nature), which we believe is the best assembly for the 427 strain, especially related to the VSG genes.

      We also agree that having signaling controlling switching in vitro does not mean the switching necessarily occurs by signaling in vivo. Nevertheless, stochastic switching is an accepted model, but it has not been proven, whereas we provide molecular evidence that signaling can cause switching. To address this reviewer’s suggestion, we edited the Discussion, page 7, line 250: from “our data imply that antigenic variation is not exclusively stochastic” to “our data suggest that antigenic variation is not exclusively stochastic”.

      Most of the upregulated genes in the RNA-seq data were VSG genes/pseudogenes. Other genes upregulated included retrotransposons and DNA/RNA processing enzymes such as endonucleases and polymerases. We included in the Results, page 3, line 100: “Other genes upregulated include primarily retrotransposons, endonucleases, and polymerase proteins.”

    2. eLife assessment

      Trypanosoma brucei evades mammalian humoral immunity through the expression of different variant surface glycoprotein genes. In this fundamental paper, the authors extend previous observations that TbRAP1 both interacts with PIP5Pase and binds PI(3,4,5)P3, indicating a role for PI(3,4,5)P3 binding and suggesting that antigen switching is signal dependent. While much of the evidence is compelling, one reviewer suggested that the work would benefit from further controls.

    3. Reviewer #1 (Public Review):

      Trypanosoma brucei undergoes antigenic variation to evade the mammalian host's immune response. To achieve this, T. brucei regularly expresses different VSGs as its major surface antigen. VSG expression sites are exclusively subtelomeric, and VSG transcription by RNA polymerase I is strictly monoallelic. It has been shown that T. brucei RAP1, a telomeric protein, and the phosphoinositol pathway are essential for VSG monoallelic expression. In previous studies, Cestari et al. (ref. 24) have shown that PIP5Pase interacts with RAP1 and that RAP1 binds PI(3,4,5)P3. RNAseq and ChIPseq analyses have been performed previously in PIP5Pase conditional knockout cells, too (ref. 24). In the current study, Touray et al. did similar analyses except that catalytic dead PIP5Pase mutant was used and the DNA and PI(3,4,5)P3 binding activities of RAP1 fragments were examined. Specifically, the authors examined the transcriptome profile and did RAP1 ChIPseq in PIP5Pase catalytic dead mutant. The authors also expressed several C-terminal His6-tagged RAP1 recombinant proteins (full-length, aa1-300, aa301-560, and aa 561-855). These fragments' DNA binding activities were examined by EMSA analysis and their phosphoinositides binding activities were examined by affinity pulldown of biotin-conjugated phosphoinositides. As a result, the authors confirmed that VSG silencing (both BES-linked and MES-linked VSGs) depends on PIP5Pase catalytic activity, but the overall knowledge improvement is incremental. The most convincing data come from the phosphoinositide binding assay as it clearly shows that N-terminus of RAP1 binds PI(3,4,5)P3 but not PI(4,5)P2, although this is only assayed in vitro, while the in vivo binding of full-length RAP1 to PI(3,4,5)P3 has been previously published by Cestari et al (ref. 24) already. 
Considering that many phosphoinositides exert their regulatory role by modulating the subcellular localization of their bound proteins, it is reasonable to hypothesize that binding to PI(3,4,5)P3 can remove RAP1 from the chromatin. However, no convincing data have been shown to support the author's hypothesis that this regulation is through an "allosteric switch". Therefore, the title should be revised.

      There are serious concerns about many conclusions made by Touray et al., according to their experimental approaches:

      1. The authors have been studying RAP1's chromatin association pattern by ChIPseq in cells expressing a C-terminal HA tagged RAP1. According to data from tryptag.org, RAP1 with an N-terminal or a C-terminal tag does not seem to have identical subcellular localization patterns, suggesting that adding tags at different positions of RAP1 may affect its function. It is therefore essential to validate that the C-terminally HA-tagged RAP1 still has its essential functions. However, this data is not available in the current study. RAP1 is essential. If RAP1-HA still retains its essential functions, cells carrying one RAP1-HA allele and one deleted allele are expected to grow the same as WT cells. In addition, these cells should have the WT VSG expression pattern, and RAP1-HA should still interact with TRF. Without these validations, it is impossible to judge whether the ChIPseq data obtained on RAP1-HA reflect the true chromatin association profile of RAP1.

      2. Touray et al. expressed and purified His6-tagged recombinant RAP1 fragments from E. coli and used these recombinant proteins for EMSA analysis: The His6 tag has been used for purifying various recombinant proteins. It is most likely that the His6 tag itself does not convey any DNA binding activities. However, using His6-tagged RAP1 fragments for EMSA analysis has a serious concern. It has been shown that His6-tagged human RAP1 protein can bind dsDNA, but hRAP1 without the His6 tag does not. It is possible that RAP1 proteins in combination with the His6 tag can exhibit certain unnatural DNA binding activities. To be rigorous, the authors need to remove the His6 tag from their recombinant proteins before the in vitro DNA binding analyses are performed. This is a standard procedure for many in vitro assays using recombinant proteins.

      3. It is unclear why Nanopore sequencing was used for RNAseq and ChIPseq experiments. The greatest benefit of Nanopore sequencing is that it can sequence long reads, which usually helps with mapping, particularly at genome loci with repetitive sequences. This seems beneficial for RAP1 ChIPseq analysis as RAP1 is expected to bind telomere repeats. However, for ChIPseq, the chromatin needs to be fragmented. Larger DNA fragments from ChIPseq experiments will decrease the accuracy of the final calculated binding sites. Therefore, ChIPseq experiments are not supposed to have long reads to start with, so Nanopore sequencing does not seem to bring any advantage. In addition, compared to Illumina sequencing, Nanopore sequencing usually yields smaller numbers of reads, and the sequencing accuracy rate is lower. The Nanopore sequencing accuracy may be a serious concern in the current study. All telomeres have the perfect TTAGGG repeats, all VSG genes have a very similar 3' UTR, and all 70 bp repeats have very similar sequences. In fact, the active and silent ESs have 90% sequence identity. Are sequence reads accurately mapped to different ESs? How is the sequencing and mapping quality controlled? Furthermore, it is unclear whether the read depth for RNAseq is deep enough.

      4. Many statements in the discussion section are speculations without any solid evidence. For example, lines 218 - 219 "likely due to RAP1 conformational changes", no data have been shown to support this at all. In lines 224-226, the authors acknowledged that more experiments are necessary to validate their observations, so it is important for the authors to first validate their findings before they draw any solid conclusions. Importantly, RAP1 has been shown to help compact telomeric and subtelomeric chromatin a long time ago by Pandya et al. (2013. NAR 41:7673), who actually examined the chromatin structure by MNase digestion and FAIRE. The authors should acknowledge previous findings. In addition, the authors need to revise the discussion to clearly indicate what they "speculate" rather than make statements as if it is a solid conclusion.

      There are also minor concerns:

      1. In the PIP5Pase conditional knockout system, the WT or mutant PIP5Pase with a V5 tag is constitutively expressed from the tubulin array. What's the relative expression level of this allele and the endogenous PIP5Pase? Without a clear knowledge of the mutant expression level, it is hard to conclude whether the mutant has any dominant negative effects or whether the mutant phenotype is simply due to a lower than WT PIP5Pase expression level.

      2. In EMSA analysis, what are the concentrations of the protein and the probe used in each reaction? The amount of protein used in the binding assay appears to be very high, and this can contribute to the observation that many complexes are stuck in the well. Better quality EMSA data need to be shown to support the authors' claims.

    4. Reviewer #2 (Public Review):

      This manuscript by Touray, et al. provides a significant new twist to our understanding of how antigenic variation may be regulated in T. brucei. Key aspects of antigenic variation are the mutually exclusive expression of a single antigen per cell and the periodic switching from expression of one antigen isoform to another. In this manuscript, the authors show, as they have previously shown, that depletion of the nuclear phosphatidylinositol 5-phosphatase (PIP5Pase) results in a loss of mutually exclusive VSG expression. Furthermore, using ChIP-seq, the authors show that the repressor/activator protein 1 (RAP1) binds to regions upstream and downstream of VSG genes located in transcriptionally repressed expression sites and that this binding is lost in the absence of a functional PIP5Pase. Importantly, the authors decided to further investigate this link between PIP5Pase and RAP1, a protein that has previously been implicated in antigenic variation in T. brucei, and found that inactivation of PIP5Pase results in the accumulation of PI(3,4,5)P3 bound to the RAP1 N-terminus and that this binding impairs the ability of RAP1 to bind DNA. Based on these observations, the authors suggest that the levels of PI(3,4,5)P3 may determine the cellular function of RAP1, either by binding upstream of VSG genes and repressing their function, or by not binding DNA and allowing the simultaneous expression of multiple VSG genes in a single parasite.

      While I find most of the data presented in this manuscript compelling, there are aspects of Figure 1 that are not clear to me. Based on Figure 1F, the authors claim that transient inactivation of PIP5Pase results in a switch from the expression of one VSG isoform to another. However, I am not exactly sure what the authors are showing in this panel, nor do the data in Figure 1F seem to be consistent with those shown in Figure 1C. Based on Figure 1F, a transient inactivation of PIP5Pase appears to result in an almost exclusive switch to a VSG located in BES12. However, based on Figure 1E, the VSG transcripts most commonly found after a transient inactivation of PIP5Pase are those from the previously active VSG (BES1) and VSGs located on chr 1 and 6 (I believe). The small font and the low resolution make it impossible to infer the location of the expressed VSG genes, nor to confirm that ALL VSG genes located in expression sites are activated, as the authors claim. Also, I was not able to access the raw ChIP-seq and RNA-seq reads. Thus, I could not evaluate the quality of the sequencing data.

    5. Reviewer #3 (Public Review):

      In this manuscript, Touray et al investigate the mechanisms by which PIP5Pase and RAP1 control VSG expression in T. brucei and demonstrate an important role for this enzyme in a signalling pathway that likely plays a role in antigenic variation in T. brucei.

      The methods used in the study are rigorous and well-controlled. The authors convincingly demonstrate that RAP1 binds to PI(3,4,5)P3 through its N-terminus and that this binding regulates RAP1 binding to VSG expression sites, which in turn regulates VSG silencing. Overall their results support the conclusions made in the manuscript.

      There are a few small caveats that are worth noting. First, the analysis of VSG derepression and switching in Figure 1 relies on a genome that does not contain minichromosomal (MC) VSG sequences. This means that MC VSGs could theoretically be misassigned as coming from another genomic location in the absence of an MC reference. As the origin of the VSGs in these clones isn't a major point in the paper, I do not think this is a major concern, but I would not over-interpret the particular details of switching outcomes in these experiments.

      The authors state that "our data imply that antigenic variation is not exclusively stochastic." I am not sure this is true. While I also favor the idea that switching is not exclusively stochastic, evidence for a signaling pathway does not necessarily imply that antigenic variation is not stochastic. This pathway could be important solely for lifecycle-related control of VSG expression, rather than antigenic variation during infection. Nevertheless, these data are critical for establishing a potential pathway that could control antigenic variation and thus represent a fundamental discovery.

      Another aspect of this work that is perhaps important, but not discussed much by the authors, is the fact that signalling is extremely poorly understood in T. brucei. In Figure 1B, the RNA-seq data show many genes upregulated after expression of the Mut PIP5Pase (not just VSGs). The authors rightly avoid claiming that this pathway is exclusive to VSGs, but I wonder if these data could provide insight into the other biological processes that might be controlled by this signaling pathway in T. brucei.

      Overall, this is an excellent study that represents an important step forward in understanding how antigenic variation is controlled in T. brucei. The possibility that this process could be controlled via a signalling pathway has been speculated for a long time, and this study provides the first mechanistic evidence for that possibility.

    1. eLife assessment

      This research advance article describes a valuable image analysis method to identify individual cells (neurons) within a population of fluorescently labeled cells in the nematode C. elegans. The findings are solid and the method succeeds in identifying cells with high precision. The method will be valuable to the C. elegans research community.

    2. Reviewer #1 (Public Review):

      In this paper, the authors developed an image analysis pipeline to automatically identify individual neurons within a population of fluorescently tagged neurons. This application is optimized to deal with multi-cell analysis and builds on a previous software version, developed by the same team, to resolve individual neurons from whole-brain imaging stacks. Using advanced statistical approaches and several heuristics tailored for C. elegans anatomy, the method successfully identifies individual neurons with a fairly high accuracy. Thus, while specific to C. elegans, this method can become instrumental for a variety of research directions such as in-vivo single-cell gene expression analysis and calcium-based neural activity studies.

      The analysis procedure depends on the availability of an accurate atlas that serves as a reference map for neural positions. Thus, when imaging a new reporter line without fair prior knowledge of the tagged cells, such an atlas may be very difficult to construct. Moreover, usage of available reference atlases, constructed based on other databases, is not very helpful (as shown by the authors in Fig 3), so for each new reporter line a de-novo atlas needs to be constructed.

      I have a few comments that may help to better understand the potential of the tool to become handy:

      1) I wonder to what degree strain mosaicism affects the analysis (Figs 1-4), as it was performed on a non-integrated reporter strain. As stated, for constructing the reference atlas, the authors used worms in which they could identify the complete set of tagged neurons. But how sensitive is the analysis when assaying worms with different levels of mosaicism? Do the results shown in the paper stem from animals with a full neural set expression? Could the authors add results for which the assayed worms show partial expression, where only 80%, 70%, or 50% of the cell population is observed, and show how this affects identification accuracy? This may be important as many non-integrated reporter lines show high mosaic patterns and may therefore not be suitable for using this analytic method. In that sense, could the authors describe the degree of mosaicism of the line used for validating the method?

      2) For the gene expression analysis (Fig 5), where was the intensity of the GFP extracted from? As it has no nuclear tag, the protein should be cytoplasmic (as seen in Fig 5a), but in Fig 5c it is shown as if the region of interest to extract fluorescence was nuclear. If fluorescence was indeed extracted from the cytoplasm, then it will be helpful to include in the software and in the results description how this was done, as a huge hurdle in dissecting such multi-cell images is avoiding cross-reads between adjacent/intersecting neurons.

      3) On the same matter: in the Methods, it is specified that the strain expressing GCaMP was also used in the gene expression analysis shown in Figure 5. But the calcium indicator may show transient intensities depending on spontaneous neural activity during the imaging. This will introduce a significant variability that may affect the expression correlation analysis as depicted in Figure 5.

    3. Reviewer #2 (Public Review):

      The authors succeed in generalizing the pre-alignment procedure for their cell identification method to allow it to work effectively on data with only small subsets of cells labeled. They convincingly show that their extension accurately identifies head angle, based on finding autofluorescent tissue and looking for a symmetric left/right axis. They demonstrate that the method works to identify known subsets of neurons with varying accuracy depending on the nature of the underlying atlas data. Their approach should be a useful one for researchers wishing to identify subsets of head neurons in C. elegans, for example in whole-brain recording, and the ideas might be useful elsewhere.

      The authors also strive to give some general insights on what makes a good atlas. It is interesting and valuable to see (at least for this specific set of neurons) that 5-10 ideal examples are sufficient. However, some critical details would help in understanding how far their insights generalize. I believe the set of neurons in each atlas version is matched to the known set of cells in the sparse neuronal marker; however, this critical detail isn't explicitly stated anywhere I can see. In addition, it is stated that some neuron positions are missing in the NeuroPAL data and replaced with the (single) position available from the OpenWorm atlas. It should be stated how many neurons are missing and replaced in this way (providing weaker information). It also is not explicitly stated that the putative identities for the uncertain cells (designated with Greek letters) are used to sample the NeuroPAL data. Either a large number of single OpenWorm positions or misidentified uncertain cells, which would force alignment against the positions of nearby but different cells, would handicap the NeuroPAL atlas relative to the matched fluorescence atlas. This is an important question, since sufficient performance from an ideal NeuroPAL atlas (subsampled) would avoid the need for building custom atlases per strain.

    1. eLife assessment

      This valuable study describes engineered dengue virus variants that can be used to dissect epitope specificities in polyclonal sera, and to design candidate vaccine antigens that dampen antibody responses against undesirable epitopes. The evidence supporting the major claims is solid, although experiments to distinguish the impact on antibody binding versus neutralizing activities would have strengthened the study. This work will be of interest to virologists and structural biologists working on antibody responses to flaviviruses.

    2. Reviewer #1 (Public Review):

      Summary of the major findings -

      1. The authors used saturation mutagenesis and directed evolution to mutate the highly conserved fusion loop (98 DRGWGNGCGLFGK 110) of the Envelope (E) glycoprotein of Dengue virus (DENV). They created 2 libraries with parallel mutations at amino acids 101, 103, 105-107, and 101-105 respectively. The in vitro transcribed RNA from the two plasmid libraries was electroporated separately into Vero and C6/36 cells and passaged thrice in each of these cells. They successfully recovered a variant N103S/G106L from Library 1 in C6/36 cells, which represented 95% of the sequence population and contained another mutation in E outside the fusion loop (T171A). Library 2 was unsuccessful in either cell type.

      2. The fusion loop mutant virus called D2-FL (N103S/G106L) was created through reverse genetics. Another variant called D2-FLM was also created, which in addition to the fusion loop mutations, also contains a previously published, evolved, and optimized prM-furin cleavage sequence that results in a mature version of the virus (with lower prM content). Both D2-FL and D2-FLM viruses grew comparably to wild type virus in mosquito (C6/36) cells but their infectious titers were 2-2.5 log lower than wild type virus when grown in mammalian (Vero) cells. These viruses were not compromised in thermostability, and the mechanism for attenuation in Vero cells remains unknown.

      4. Next, the authors probed the neutralization of these viruses using a panel of monoclonal antibodies (mAbs) against fusion loop and domain I, II and III of E protein, and against prM protein. As intended, neutralization by fusion loop mAbs was reduced or impaired for both D2-FL and D2-FLM, compared to wild type DENV2. D2-FLM virus was equivalent to wild type with respect to neutralization by domain I, II, and III antibodies tested (except domain II-C10 mAb) suggesting an intact global antigenic landscape of the mutant virion. As expected, D2-FLM was also resistant to neutralization by prM mAbs (D2-FL was not tested in this batch of experiments).

      5. Finally, the authors evaluated neutralization in the context of polyclonal serum from convalescent humans (n=6) and experimentally infected non-human primates (n=9) at different time points (27 total samples). Homotypic sera (DENV2) neutralized D2-FL, D2-FLM, and wild type DENV similarly, suggesting that the contribution of fusion loop and prM epitopes is insignificant in a serotype-specific neutralization response. However, heterotypic sera (DENV4) neutralized D2-FL and D2-FLM less potently than wild type DENV2, especially at later time points, demonstrating the contribution of fusion loop- and prM-specific antibodies to heterotypic neutralization.

      Impact of the study-

      1. The engineered D2-FL and D2-FLM viruses are valuable reagents to probe antibodies targeting the fusion loop and prM in the overall polyclonal response to DENV.

      2. Though more work is needed, these viruses can facilitate the design of a new generation of DENV vaccines that do not elicit fusion loop- and prM-specific antibodies, which are often poorly neutralizing and lead to antibody-dependent enhancement (ADE).

      3. This work can be extended to other members of the flavivirus family.

      4. A broader impact of their work is a reminder that conserved amino acids may not always be critical for function and therefore should not be immediately dismissed in substitution/mutagenesis/protein design efforts.

      Evaluating this study in the context of prior literature -

      The authors write "Although the extreme conservation and critical role in entry have led to it being traditionally considered impossible to change the fusion loop, we successfully tested the hypothesis that massively parallel directed evolution could produce viable DENV fusion-loop mutants that were still capable of fusion and entry, while altering the antigenic footprint."
      ".....Previously, a single study on WNV successfully generated a viable virus with a single mutation at the fusion loop, although it severely attenuated neurovirulence. Otherwise, it has not been generated in DENV or other mosquito-borne flaviviruses"

      The above claims are a bit overstated. In the context of other flaviviruses:

      - A previous study applied a similar saturation mutagenesis approach to the *full length* E protein of Zika virus and found that while the conserved fusion loop was mutationally constrained, some mutations, including at amino acid residue 106, were tolerated (PMID 31511387).
      - The Japanese encephalitis virus (JEV) SA14-14-2 live vaccine strain contains an L107F mutation in the fusion loop (in addition to other changes elsewhere in the genome) relative to the parental JEV SA14 strain (PMID: 25855730).
      - For tick-borne encephalitis virus (TBEV-DENV4 chimera), an H104G/L107F double mutant has been described (PMID: 8331735).

      There have also been previous examples of functionally tolerated mutations within the DENV fusion loop:

      - Goncalvez et al. isolated an escape variant of DENV2 using chimpanzee Fab 1A5, with a mutation in the fusion loop, G106V (PMID: 15542644). G106 is also mutated in the D2-FL clone (N103S/G106L) described in the current study.
      - In the context of single-round infectious DENV, mutation at site 102 within the fusion loop has been shown to retain infectivity (PMID 31820734).

      Appraisal of the results -

      The data largely support the conclusions, but some improvements and extensions can benefit the work.

      1. Line 92-93: "This major variant comprised ~95% of the population, while the next most populous variant comprised only 0.25% (Figure 1C)".
         What is the sequence of the next most abundant variant?

      2. Lines 94-95: "Residues W101, C105, and L107 were preserved in our final sequence, supporting the structural importance of these residues."
         L107F is viable in other flaviviruses.

      3. Figure 2c: The FLM sample in the western blot shows hardly any E protein, making E/prM quantitation unreliable.

      4. Lines 149-151: "Importantly, D2-FL and D2-FLM were resistant to antibodies targeting the fusion loop. While neutralization by 1M7 is reduced by ~2-logs, no neutralization was observed for 1N5, 1L6, and 4G2 for either variant (Figure 3 A)".

      a) Partial neutralization was observed for 1N5, for D2-FL.
      b) Do these mAbs cover the full spectrum of fusion loop antibodies identified thus far in the field?
      c) Are the epitopes known for these mAbs? It would be useful to discuss how the epitope of 1M7 differs from those of the other mAbs. What are the critical residues?
      d) Maybe the D2-FL mutant can be further evolved under selection pressure from fusion loop mAbs 1M7 +/- 1N5 and/or other fusion loop mAbs.

      5. It would have been useful to include D2-M for comparison (with evolved furin cleavage sequence but no fusion loop mutations).

      6. Data for polyclonal serum can be better discussed. Table 1 is not discussed much in the text. For the R1160-90dpi-DENV4 sample, D2-FL and D2-FLM are neutralized better than wild type DENV2? The authors' interpretation in lines 181-182 is inconsistent with the data presented in Figure 3C, which suggests that over time, there is INCREASED (not waning) dependence on FL- and prM-specific antibodies for heterotypic neutralization.

      Suggestions for further experiments-

      1. It would be interesting to see the phenotype of single mutants N103S and G106L, relative to double mutant N103S/G106L (D2-FL).
      2. The fusion capability of these viruses can be gauged using liposome fusion assay under different pH conditions and different lipids.
      3. Correlative antibody binding vs neutralization data would be useful.

    3. Reviewer #2 (Public Review):

      Antibody-dependent enhancement (ADE) of Dengue is largely driven by cross-reactive antibodies that target the DENV fusion loop or pre-membrane protein. Screening polyclonal sera for antibodies that bind to these cross-reactive epitopes could increase the successful implementation of a safe DENV vaccine that does not lead to ADE. However, there are few reliable tools to rapidly assess polyclonal sera for epitope targets and ADE potential. Here the authors develop a live viral tool to rapidly screen polyclonal sera for binding to fusion loop and pre-membrane epitopes. The authors performed a deep mutational scan for viable viruses with mutations in the fusion loop (FL). The authors identified two mutations that are functionally tolerated in insect C6/36 cells but lead to defective replication in mammalian Vero cells. These mutant viruses, D2-FL and D2-FLM, were tested for epitope presentation with a panel of monoclonal antibodies and polyclonal sera. The D2-FL and D2-FLM viruses were not neutralized by FL-specific monoclonal antibodies, demonstrating that the FL epitope has been ablated. However, the neutralization data with polyclonal sera contradict the claim that cross-reactive antibody responses targeting the pre-membrane and the FL epitopes wane over time.

      Overall, the central conclusion that the engineered viruses can predict epitopes targeted by antibodies is supported by the data, and the D2-FL and D2-FLM viruses represent a valuable tool for the DENV research community.

    1. eLife assessment

      This important article provides insights into the neural centres in the Japanese quail brain that are associated with photoperiod-induced life-history states. The physiological and transcriptomic analyses of the mediobasal hypothalamus and pituitary gland offer evidence for a coincidence timing mechanism for measuring day length, which is relevant for the field of circannual biology. Despite some shortcomings in data analysis, the study's convincing experiments and findings have the potential to captivate the attention of molecular and organismal endocrinologists.

    2. Reviewer #1 (Public Review):

      The authors investigated the molecular correlates in potential neural centers in the Japanese quail brain associated with photoperiod-induced life-history states. The authors simulated photoperiods to attain winter- and summer-like physiology, sampled neural tissues at spring and autumn life-history states, examined daily rhythms in transcripts at solstices and equinoxes, and lastly studied FSHb transcripts in the pituitary. The experiments are based on a series of changes in photoperiod and gave some interesting results. The experiment did not have a control for no change in photoperiod, so it seems possible that endogenous rhythms could be another aspect of seasonal rhythms that is lacking in this study. The short-day group does not explain the endogenous seasonal response.

      The manuscript would benefit from further clarity in synthesizing different sections. Additionally, there are some instances of unclear language and numerous typos throughout the manuscript. A thorough revision is recommended, including addressing sentence structure for improved clarity, reframing sentences where necessary, correcting typos, conducting a grammar check, and enhancing overall writing clarity.

      Data analysis needs more clarity, particularly on how the transcriptome data explain different physiological measures across seasonal life-history states. It seems the discussion is built around a few genes that have been studied in other published literature on quail seasonal response. Extending the results to the promoters of DEGs and building the discussion on them extrapolates from limited evidence and seems redundant.

      Last, I wondered if it would be possible to add an ecological context for the frequent change in the photoperiod schedule and not take account of the endogenous annual response. Adding discussion on ecological relevance would make more sense.

    3. Reviewer #2 (Public Review):

      This study is carefully designed and well executed, including a comprehensive suite of endpoint measures and large sample sizes that give confidence in the results. I have a few general comments and suggestions that the authors might find helpful.

      1) I found it difficult to fully grasp the experimental design, including the length of light treatment in the three different experiments (which appears to extend from 2 weeks up to 8 weeks). A graphical description of the experimental design along a timeline would be very helpful to the reader. I suggest adding the respective sample sizes to such a graphic, because this information is currently also difficult to keep track of.

      2) The authors use a lot of terminology that is second nature to a chronobiologist but may be difficult for the general reader to keep track of. For example, what is the difference between "photoinducibility" and "photosensitivity"? Similarly, "vernal" and "autumnal" should be briefly explained at the outset, or maybe simply say "spring equinox" and "fall equinox."

      3) What was the rationale for using only male birds in this study? The authors may want to include a brief discussion on whether the expected results for females might be similar to or different from what they found in males, and why.

      4) The authors used the Bonferroni correction method to account for multiple hypothesis testing of measures of testes mass, body mass, fat score, vimentin immunoreactivity and qPCR analyses in Study 1. I don't think Bonferroni is ever appropriate for biological data: these methods assume that all variables are independent of each other, an assumption that is almost never warranted in biology. In fact, the data show clear relationships between these endpoint measures. Alternatively, one might use Benjamini-Hochberg's FDR correction or various methods for calculating the corrected alpha level.

      5) The graphical interpretations of the results shown in Figure 1n and Figure 3e, along with the hypothesized working model shown in Figure S5, might best be combined into a single figure that becomes part of the Discussion. As is, I do not think these interpretative graphics (which are well done and super helpful!) are appropriate for the Results section.
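
      The Benjamini-Hochberg correction suggested in point 4 can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: the function names and the eight p-values are hypothetical, chosen only to show how the step-up rule compares with a Bonferroni threshold on the same inputs.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis when its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level alpha.

    Returns a list of booleans aligned with p_values (True = rejected).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight endpoint measures (illustrative only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print("Bonferroni rejections:", sum(bonferroni(example_p)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(example_p)))
```

      On these made-up inputs the step-up rule rejects one more hypothesis than the Bonferroni threshold, which is the kind of difference in stringency the reviewer is pointing to.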

    4. Reviewer #3 (Public Review):

      It is well known that as seasonal day length increases, molecular cascades in the brain are triggered to ready an individual for reproduction. Some of these changes, however, can begin to occur before the day length threshold is reached, suggesting that short days similarly have the capacity to alter aspects of phenotype. This study seeks to understand the mechanisms by which short days can accomplish this task, which is an interesting and important question in the field of organismal biology and endocrinology.

      The set of studies that this manuscript presents is comprehensive and well-controlled. Many of the effects are also strong and thus offer tantalizing hints about the endo-molecular basis by which short days might stimulate major changes in body condition. Another strength is that the authors put together a compelling model for how different facets of an animal's reproductive state come "on line" as day length increases and spring approaches. In this way, I think the authors broadly fulfill their aims.

      I do, however, also think that there are a few weaknesses that the authors should consider, or that readers should consider when evaluating this manuscript. First, some of the molecular genetic analyses should be interpreted with greater caution. By bioinformatically showing that certain DNA motifs exist within a gene promoter (e.g., FSHbeta), one is not generating robust evidence that corresponding transcription factors actually regulate the expression of the gene in question. In fact, some may argue that this line of evidence only offers weak support for such a conclusion. I appreciate that actually running the laboratory experiments necessary to generate strong support for these types of conclusions is not trivial, and doing so may even be impossible. I would therefore suggest a clear admission of these limitations in the paper.

      Second, I have another issue with the interpretation of data presented in Figure 3. The data show that FSHbeta increases in expression in the 8Lext group, suggesting that endogenous drivers likely act to increase the expression of this gene despite no change in day length. However, more robust effects are reported for FSHbeta expression in the 10v and 12v groups, even compared to the 8Lext group. Doesn't this suggest that both endogenous mechanisms and changes in day length work together to ramp up FSHbeta? The rest of the paper seemed to emphasize endogenous mechanisms and gloss over the fact that such mechanisms likely work additively with other factors. I felt like there was more nuance to these findings than the authors were getting into.

      Third, studies 1 - 3 are well controlled; however, I'm left wondering how much of an effect the transitions in day length might have on the underlying molecular processes that mediate changes in body condition. While the changes in day length are themselves ecologically relevant, the transitions between day length states are not. How do we know, for example, that more gradual changes in day length that occur over long timespans do not produce different effects at the levels of the brain and body? This seemed especially relevant for study 3, where animals experience a rather sudden change in day length. I recognize that these experimental methods are well described in the literature, and they have been used by endocrinologists for a long time; nonetheless, I think questions remain.

    1. eLife assessment

      This paper describes an important, well-organized study into an under-exploited area of spatial transcriptomics. The limitations of the approach are generally made clear, but there is insufficient orthogonal validation to demonstrate the biological significance of the results, which leads to the evidence for the claims being currently incomplete. Nevertheless, the tools presented will provide a resource to researchers wishing to characterise spatial patterning of mRNAs, and the paper will be of interest to researchers studying cell biology, RNA biology, and method development for spatial transcriptomics/proteomics.

    2. Reviewer #1 (Public Review):

      Bierman et al. have developed a set of metrics for measuring the spatial patterning of mRNAs in high-throughput fluorescence in situ hybridisation experiments and applied these to identify a subset of mRNAs whose spatial patterning correlates with 3'UTR length. A strength of the study is the clarity and honesty with which the authors have outlined the strengths and weaknesses of their own approach and reported negative results. A key benefit of the tool is that the methodological choices allow wide applicability to existing datasets. However, these choices also feed into a limitation of the method, which is the difficulty in interpreting the biology underpinning the metrics - raising the question of how users will understand the output of the tool.

    3. Reviewer #2 (Public Review):

      The authors develop SPRAWL (Subcellular Patterning Ranked Analysis With Labels), a statistical framework to identify cell-type specific subcellular RNA localization from multiplexed imaging datasets. The tool is able to assign to each gene, in each annotated cell type, a score (with a p-value) that measures:
      - Peripheral/central localization of RNAs within the cell, based on a previous segmentation step defining cell boundaries and the centroid coordinate.
      - Radial/punctate localization of RNAs within the cell.

      The method is applied to three multiplexed imaging datasets, identifying defined and cell-type specific patterns for several transcripts.

      In the second part of the manuscript, the authors couple SPRAWL with ReadZS, a computational tool developed by the same group and recently published (Meyer et al, 2022). Starting from single-cell datasets, ReadZS is able to quantify 3'UTR length in each cell type. The authors find a subset of genes showing a positive, or negative correlation between the predicted localization and the predicted 3'UTR length across cell types.

      Strengths:
      As the authors state in the introduction, the study of subcellular RNA localization, with the characterization of organizational principles and of molecular regulation mechanisms, is extremely relevant. The authors develop a strategy to detect statistically significant and non-random patterns of RNA sub-cellular localization in MERFISH and SeqFISH+ datasets, i.e. emerging platforms producing spatially resolved maps of hundreds of transcripts with cellular resolution.

      Weaknesses:
      Although the method and the presented results have strengths in principle, the main weakness of the paper is that these strengths are not directly demonstrated. That is, insufficient validations are performed to show the biological significance of the results and to fully support the key claims in the manuscript by the data presented.

      In particular, the authors imply that their tool is unique and not comparable to any other method. Therefore there is no comparison of SPRAWL with any other method. For example, a comparison could be made with Baysor (Petukhov, V et al. Nat Biotechnol. https://doi.org/10.1038/s41587-021-01044-w). According to the authors, this method is able to identify "small molecular neighbourhoods with stereotypical transcriptional composition" and provides a "General approach for statistical labeling of spatial data".

      The authors claim that SPRAWL is able to identify spatial patterns of localization and generated relevant hypotheses to be tested, yet the manuscript contains little proof that the results have biological significance (for example association of RNAs with specific subcellular compartments) and there is no experimental validation for the results obtained applying this method.

      The correlation between localization scores and 3'UTR length across cell types for certain genes is also not experimentally validated: results are based on inference from single-cell or imaging data, with no complementary experimental validation.

      It is therefore very difficult to assess the biological relevance of the results produced by SPRAWL.

    4. Reviewer #3 (Public Review):

      Bierman et al. present a novel statistical framework for examining the subcellular localisation of RNA molecules. Subcellular Patterning Ranked Analysis With Labels, SPRAWL, uses the data available in multiplexed single-cell imaging datasets to assign four metrics of localisation patterns to RNA at a gene per cell level. These easy-to-understand scores, ranging from -1 to 1, can be averaged to detect cell-type specific spatial patterns or used in tandem with tools for RNA 3' UTR length or splicing state to determine the correlation between subcellular localisation and RNA isoforms. Such quantitative association between RNA isoforms and localisation provides a useful tool to determine candidate genes for future studies.

      The peripheral and central scores indicate the proximity of RNA molecules to the cell boundary and centre of the cell respectively in relation to other RNA present in the cell. Whilst understanding whether a gene tends to be localised to the cellular membrane is important, it is unclear what biological benefits the central metric gives compared to high "anti-peripheral" scores considering that no single organelle (eg. the nucleus) is located specifically at the centre of the cell in all cell-types.

      The punctate and radial patterning scores provide information on the spatial aggregation of RNA molecules of a given gene within a cell. Whilst the punctate score is easy to understand as simply the distance between RNA, the radial score, the angle between RNA, is harder to understand from the main text and would benefit from a schematic showing how this is in respect to the cell-boundary centroid.

      Despite endeavouring to create a robust statistical measure of RNA subcellular localisation, this paper is full of inconsistencies. Values (eg. Pearson correlation coefficient values, number of significant genes, number of total genes) and names (eg. cell types, gene names) stated throughout the main text and figures/table do not match repeatedly and without fixing these disparities, the conclusions from this paper are hard to believe.

    1. eLife assessment

      Overall, this is an interesting topic of study, and the conclusions could be of relevance more broadly. However, mechanistic support, limited TTF frequencies/timing, and visual support of the quantitative data would be critical in order to provide convincing and rigorous support for this interesting concept.

    2. Reviewer #1 (Public Review):

      This study is based on the hypothesis that tumor treating fields, a form of cancer therapy that exposes tumors to alternating electrical fields, have an effect on tunneling nanotubes, fine actin-rich protrusions that connect cancer cells and allow intercellular communication, contributing to the tumor microenvironment and therapeutic resistance. This is an interesting hypothesis and may be of importance. To prove their hypothesis, better data presentation and mechanistic studies are needed, as it is not clear based on this study how the proposed effect is working.

    3. Reviewer #2 (Public Review):

      The authors tested TTFields' effect on TNT formation in two mesothelioma cell lines, MSTO-211H and VMAT. The MSTO-211H is a biphasic cell line with epithelioid and sarcomatoid features while VMAT only has sarcomatoid morphology. They treated their cell lines at 150 or 200 kHz either unidirectionally or bidirectionally. The experiments took place within 72 hours of plating, after which the cells will become confluent on coverslips and their TNT formation drops.

      Under these experimental conditions, they found: (i) unidirectional is more effective than bidirectional TTFields in reducing TNT formation, (ii) TNT formation was markedly reduced after 48 hours of treatment in MSTO-211H but not VMAT cells, (iii) no difference in actin polymerization or actin filament bundling after one hour of TTFields treatment, (iv) reduced TNT formation when TTFields were combined with cisplatin but not with both cisplatin and pemetrexed, (v) analysis of TNT cargo transport using markers of gondolas and mitochondria did not show changes in transport velocity, and (vi) in vivo spatial transcriptomic analysis revealed EMT markers and immunogenic markers.

    4. Reviewer #3 (Public Review):

      Sarkari et al. describe the effects of TTFields on inter-cellular communication structures called tunneling nanotubes in malignant pleural mesothelioma cells. Recent studies have implicated these F-actin-based nanotubes in promoting malignant transformation and biology by allowing long-range communications between malignant cells. The authors suggest that TTFields disrupt these structures by impacting the expression of genes involved in nanotube formation and cell proliferation. Although TTFields are thought to affect tubulin-based structures, recent studies suggest that TTFields also impact actin-based structures. Therefore, the authors' findings are in keeping with this new understanding. They also found that TTFields upregulated marker genes in immunity. This is one of the first studies that implicate TTFields in these tunneling nanotube structures. Overall, the study adds to our understanding of TTFields on various cellular structures. However, conclusions are only partially supported by the data presented. The study is largely descriptive and there are many areas that need to be addressed to substantively improve the premise and rigors and strengthen the conclusions.

    1. eLife assessment

      This manuscript suggests a novel mechanism of purifying selection by which programmed cell death contributes to the selective removal of mtDNA deletion mutations in C. elegans. The evidence for this mechanism of removal is strong although questions remain regarding the underlying mechanism and the role of canonical ageing pathways. Because of the likely central role of mtDNA deletions in ageing and age-dependent diseases, this work will be of interest to scientists in the field of mitochondrial biology as well as ageing.

    2. Reviewer #1 (Public Review):

      Flowers et al. studied requirements for the persistence and clearance of mutant mtDNA in C. elegans using the uaDf5 deletion in mtDNA. This mutant mtDNA persists at relatively constant levels, despite clearly having detrimental effects. Surprisingly, no mutations were found in the existing wt copies, which would have otherwise explained the persistence of mutant DNA by complementation. The authors then investigated the potential role of programmed cell death in the removal of mutant mtDNA from the germline using crosses with existing cell death mutants. They observed increased amounts of uaDf5 DNA in 1-day-old progeny in strains with mutations in the caspases ced-3 and csp-1 and in several other cell death genes, showing that a significant amount of uaDf5 is removed by PCD in the germline. The authors also observed increased uaDf5 over time in the germline, and effects of lifespan mutations on the amount of uaDf5. This was true both for the insulin signaling pathway and the clk-1 pathway, suggesting that both pathways regulate uaDf5 levels, consistent with the connection between longevity and mitochondrial homeostasis. Finally, the authors discuss results showing that PCD mutants with high amounts of uaDf5 in the germline have surprisingly low amounts of uaDf5 in their progeny, which would suggest that PCD can be replaced by another clearance method.

      This manuscript is of general interest because it demonstrates the importance of PCD for clearance of mutant mtDNA. The evidence for this mechanism of removal is strong. The effects of the aging mutants are more difficult to understand and the discussion of these effects is therefore somewhat speculative.

    3. Reviewer #2 (Public Review):

      In this study, the authors sought to elucidate regulators of mitochondrial DNA (mtDNA) quality control in the germline. To this end, the authors used Caenorhabditis elegans as a model organism and 3.1kb mtDNA deletion mutation uaDf5 that is stably transmitted across generations. The key data presented were the heteroplasmy level of mtDNA, specifically the molar ratio of mutant vs. wildtype (WT) mtDNA molecules, at different ages. The authors specifically focused on the role of programmed cell death (PCD) signaling and a few well-known aging pathways in C. elegans. The data showed that attenuation of PCD has the general effect of increasing the steady-state mutant-to-WT ratio, while increasing PCD does not reduce this ratio. The data also showed that this mutant-to-WT ratio increases with age, an effect that is transmitted to progenies, and that perturbations to well-known insulin signaling and CLK-1 aging pathways affect the rate of this increase, where a longer lifespan is correlated with a slower increase. Finally, the data demonstrated an intergenerational reduction in mutant-to-WT ratio and that the degree of this reduction has a nonlinear ultrasensitive-like dependence on the ratio.

      A strength of the study is the comprehensive exploration of the role of key molecules of the PCD machinery in mtDNA quality control in the germline. Also, the data on the effects of age and aging pathways on the maintenance of mtDNA quality in the germline, as well as on intergenerational mtDNA quality control, are extremely interesting and have the potential to trigger transformative studies that connect mtDNA purifying selection and aging.

      A major weakness of the study is that the key findings are predominantly based on data on the mutant-to-WT mtDNA ratio. However, a higher mutant-to-WT ratio does not necessarily equate to an increase/accumulation of mutant mtDNA in the cell population, since the same increase can also be caused by a decrease in WT mtDNA. No data for copy numbers of WT and mutant mtDNA or their proxies were analyzed. As a consequence, some of the major findings, such as the non-canonical/non-apoptotic role of PCD machinery in mediating mitochondrial purifying selection and the accumulation of mutant mtDNA with age, cannot be uniquely concluded from the data. Alternative explanations could be given to explain the observed trends of mutant-to-WT ratios.
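
      The ambiguity described above can be made concrete with a small arithmetic sketch. The copy numbers below are purely hypothetical and are not taken from the study; they only illustrate that a ratio alone cannot distinguish a gain of mutant mtDNA from a loss of WT mtDNA.

```python
def mutant_to_wt_ratio(mutant_copies, wt_copies):
    """Molar ratio of mutant to wild-type mtDNA copies."""
    return mutant_copies / wt_copies

# Hypothetical copy numbers, chosen only for illustration.
baseline = mutant_to_wt_ratio(600, 400)       # starting population
more_mutant = mutant_to_wt_ratio(1200, 400)   # mutant copies double, WT unchanged
less_wt = mutant_to_wt_ratio(600, 200)        # mutant unchanged, WT copies halve

# Both scenarios double the ratio, although only the first
# involves any accumulation of mutant mtDNA.
print(baseline, more_mutant, less_wt)
```

      Under these assumed numbers, measuring absolute copy numbers (or a proxy for them) would be needed to tell the two scenarios apart.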

      Another weakness is that the connection between the two pathways in this study, PCD and aging, in regulating mtDNA quality control was not explored more deeply. The study did not delve into how the interplay of aging and PCD, if any, affects mtDNA quality control in the germline.

      Finally, as the authors noted, the important role of stochasticity in purifying selection against pathogenic mtDNA is established. Yet, this aspect of purifying selection is not explored in this study (e.g., how such stochasticity is working with PCD in mtDNA quality control in the germline), nor is it accounted for in the analysis of the data and the discussion of the observation.

    1. eLife assessment

      This work provides a valuable structural and functional characterization of the neurotransmitter's spatial distribution heterogeneity in cortical and subcortical regions. The authors report a systematic description and annotation of a new "layer" of brain organization that has been relatively poorly integrated with the wider neuroimaging literature to date. In sum, this paper has the potential to be of great interest to a wide audience in neurosciences.

    2. Reviewer #1 (Public Review):

      The work presented here uses a large collection of PET data to discover the principal axes of neurotransmitter receptor/transporter molecule (NTRM) variation in the human cortex and subcortex. These spatial axes are then systematically annotated for their alignment with diverse other measures of brain organization. The work is valuable for providing a systematic description and annotation of a new "layer" of brain organization that has been relatively poorly integrated with the wider neuroimaging literature to date. The methods used are state-of-the-art and the findings generated by these methods are sound. The discovered NTRM gradients will allow others in the field to more easily incorporate information of neurotransmitter maps in their analyses - helping to advance integration between different views of the human brain. A fundamental challenge to this goal of cross-modal integration, however - which doesn't just impact this work, but the field more broadly - is that we are often left to work with spatial correlations between modalities in humans. The lack of access to experimental methods means that the biological basis for observed spatial correlations between different brain features in humans is typically poorly understood. It is therefore hard to know what newly-reported spatial correlations are telling us about brain organization that was not already captured in prior work. Nevertheless, the new resources and results presented here are important because they can guide the future work needed to unpick the biology behind spatially correlated features of the human brain.

    3. Reviewer #2 (Public Review):

      In this work, Hänisch and colleagues investigate the relationship between neurotransmitter transporter and receptor's spatial heterogeneity and well-studied functional and structural brain gradients in the human brain. They calculate the spatial similarity between the distribution of the neurotransmitter transporters and receptors for each parcel, thus obtaining a new brain distribution comprising a similarity index of all neurotransmitters mapped to each brain area. They employ a nonlinear dimensionality reduction on this neurotransmitter similarity map to reveal three spatial gradients for cortical and subcortical levels, respectively. Based on this, they characterize their significance by comparing them with functional fMRI meta-analytic activations, MRI microstructure, architectural contextualization, MRI-based structural and functional connectivity, and gray matter atrophy-derived disease maps.

      The claim of the work is broad, and the motivation is general, but the data presented is specific and biologically diverse. The neurotransmitter system operates at different pre- and post-synaptic levels, and the general assumption that transporters are equivalent to receptors is not supported by appropriate discussion. The motivations of the work are very broad, and the analysis used is sufficient for the general claims, but the data presented is specific and biologically diverse.

      Besides these conceptual issues, I find this work interesting as it jointly characterizes the cortical and subcortical PET neurotransmitter's distribution maps and their structural and functional meaning for the first time. In essence, the study presents several arguments to consider the organization of the characterized maps as an additional layer of brain organization. The results are convincing and clearly presented. Although this is a correlative study using unconnected datasets, I appreciate the use of multiple brain maps. I also appreciate that the authors made the data and code available for reproducibility. The data and analysis used in the current draft enable a powerful set of tools for hypothesis testing in the human brain's natural distribution of neurotransmitters beyond the usual pharmacological intervention strategy traditionally used in neurotransmitters' brain mapping area.

    1. eLife assessment

      This important study used high-resolution brain imaging methods to visualize and index non-invasively auditory and language pathways of young children born with inner ear malformations or cochlear nerve dysfunction resulting in profound hearing loss. Nerve fiber impairments were compellingly demonstrated in subcortical auditory and cortical language pathways relative to typically-hearing controls. The results suggested novel approaches for clinical assessment of central auditory and language pathways that may influence different intervention strategies, pending further evidence linking these structural findings with functional outcomes.

    2. Reviewer #1 (Public Review):

      In this study, Wang et al performed structural peripheral and central imaging of the auditory pathway using high-resolution MR. For the first time, they evaluated children with congenital severe to profound sensorineural hearing loss with and without cochlear nerve deficiency and cochlear malformations. The authors evaluated 13 children with severe to profound congenital hearing loss (6 with cochlear nerve deficiency) and 10 typically-hearing controls. They found significant differences in the central auditory pathway that were influenced by the status of the peripheral auditory pathway. Determination of outcomes after cochlear implantation or auditory brainstem implantation is critical and we currently have no good methods for this, so this study is very promising in that regard.

      The authors have achieved their aim of evaluating these children with high-resolution imaging and identifying differences in auditory pathways. My primary issues are that some of their claims for clinical potential are not justified as of yet and the authors did not determine a diagnosis for the patients' hearing loss.

    3. Reviewer #2 (Public Review):

      The aim of this work is to introduce a new pipeline for mapping the human auditory pathway using structural and diffusional MRI, and to examine the brain structural development of children with profound congenital sensorineural hearing loss (SNHL) at both the acoustic processing level and the speech perception level. The authors use this pipeline to investigate the structural development of the auditory-language network for profound SNHL children with normal peripheral structure and those with inner ear malformations and/or cochlear nerve deficiency (IEM&CND). The authors successfully developed a new pipeline for reconstructing the human auditory pathway and used it to investigate the structural development of the auditory-language network in children with profound SNHL. They segmented the subcortical auditory nuclei using super-resolution track density imaging (TDI) maps and T1-weighted images and tracked the auditory and language pathways using probabilistic tractography. The authors found that the language pathway was more sensitive to peripheral auditory condition than the central auditory pathway, highlighting the importance of early intervention for profound SNHL children to provide timely speech inputs. The authors also proposed a comprehensive pre-surgical evaluation extending from the cochlea to the auditory-language network, which has promising clinical potential.

      The major strengths of this work are the use of a new pipeline for mapping the human auditory pathway, the inclusion of children with profound SNHL with and without IEM&CND, and the finding that the language pathway is more sensitive to peripheral auditory condition than the central auditory pathway. However, a limitation of this study is the small sample size, which may limit the generalizability of the findings.

      The results support the conclusions that the language pathway is more sensitive to peripheral auditory condition than the central auditory pathway, highlighting the importance of early intervention for profound SNHL children to provide timely speech inputs.

      This work has the potential to have a significant impact on the field by providing new insights into the structural development of the auditory-language network in children with profound SNHL. The methods and data presented in this work may be useful to the community in developing comprehensive pre-surgical evaluation for children with profound SNHL extending from the cochlea to the auditory-language network.

    4. Reviewer #3 (Public Review):

      This study presents a new pipeline for mapping the auditory-language pathway in children with profound congenital sensorineural hearing loss (SNHL), focusing on those with inner ear malformations and/or cochlear nerve deficiency (IEM&CND). Using structural and diffusional MRI, the researchers investigated the structural fiber properties of the auditory-language networks in affected children under six years old. Findings suggest that the language pathway is more sensitive to peripheral auditory condition than the central auditory pathway, emphasizing the need for early intervention to provide speech inputs. The study also proposes a comprehensive pre-surgical evaluation from the cochlea to the auditory-language network.

      Strengths:

      1. Investigating fiber properties across various brain network levels (from peripheral structures to central auditory and higher-level language pathways) using high-resolution diffusion imaging and an innovative pipeline.

      2. Evaluating presurgical fiber properties in two subgroups of SNHL children (cochlear implant and auditory brainstem implant candidates) to demonstrate the relationship between peripheral auditory structure damage and the development of auditory-language structural pathways.

      Weaknesses:

      1. Limited sample size: The study analyzed data from 13 SNHL children and 10 normal-hearing children, potentially restricting the validity and reproducibility of the findings, particularly in correlation results based on individual differences.

      2. Lack of speech and language behavioral measures: Although the researchers collected behavioral data post-CI/ABI surgery for most participants, no such data was reported. Consequently, the association between presurgical fiber measures and postsurgical outcomes remains unclear.

      3. Unclear practical implications: The relevance of the presurgical evaluation of the auditory-language network for surgical decision-making and prognosis estimation is not evident, as fiber measures may not correlate with behavioral outcomes.

    1. eLife assessment

      This work provides fundamental new insight into fine axonal morphologies based solely on extracellular action potential recordings. They provide compelling evidence of fine resolution in mapping functional connections between neurons. The work may have broad use in neurobiology, bioengineering, stem cell biology, as well as tissue engineering in functional characterization.

    2. Reviewer #1 (Public Review):

      The authors developed a new approach to enable the reconstruction of fine axonal morphologies based solely on extracellular action potential recordings from in vitro mammalian neurons using a high-density microelectrode array system with an integrated CMOS camera. They provide compelling evidence of fine resolution in mapping functional connections between neurons via very fine axons. The advantage of the approach is that it provides a label-free electrical visualization of axon conduction trajectories as well as the ability to access the AP waveforms. The work may have broad use in neurobiology, bioengineering, stem cell biology, as well as tissue engineering in functional characterization.

    3. Reviewer #2 (Public Review):

      This is a very interesting and compelling paper reporting a method for analyzing the features of action potential conduction in cortical and spinal neurons in vitro using high-density CMOS micro-electrode arrays. The authors report the performance of their detection algorithm, which allows them to reconstruct the functional map of single-branching axons. In particular, they compare the functional conduction maps of cortical and spinal axons, and they show that spinal axons display larger spike signals in their distal part compared to cortical axons, but a lower number of branches. In addition, they reveal that spinal axons display a higher conduction velocity compared to cortical ones.

      This study is particularly interesting as it constitutes a compelling methodological report of action potential propagation up to 5-8 mm in single axons in vitro.

    1. Author Response

      Reviewer #2 (Public Review):

      Associative learning assigns valence to sensory cues paired with reward or punishment. Brain regions such as the amygdala in mammals and the mushroom body in insects have been identified as primary sites where valence assignment takes place. However, little is known about the neural mechanisms that translate valence-specific activity in these brain regions into appropriate behavioral actions. This study identifies a small set of upwind neurons (UpWiNs) in the Drosophila brain that receive direct inputs from two mushroom body output neurons (MBONs) representing opposite valences. Through a series of behavioral, imaging, and electrophysiological experiments, the authors show that UpWiNs are differentially regulated by the two MBONs, i.e., inhibited by the glutamatergic MBON-α1 (encoding negative valence) while activated by the cholinergic MBON-α3 (encoding positive valence). They also show that UpWiNs control the wind-directed behavior of flies. Activation of UpWiNs is sufficient to drive flies to orient and move upwind, and inhibition of UpWiNs reduces flies' upwind movement toward the source of reward-predicting odors (CS+). These results, together with existing knowledge about the function of the mushroom body in memory processing, suggest an appealing model in which reward learning decreases and increases the responses of MBON-α1 and MBON-α3 to the CS+ odor, respectively, and these changes cause UpWiNs to respond more strongly to the CS+ odor and drive upwind locomotion. Interestingly, in the final part of the results, the authors reveal a wind-independent function of UpWiNs: increasing the probability that flies will revisit the site where UpWiNs were activated. Thus, UpWiNs guide learned reward-seeking behavior with and without airflow.
      Although the mushroom body has been extensively studied for its role in learning and memory, the downstream neural circuits that read the information from the mushroom body to guide memory-driven behaviors remain poorly characterized. This study provides an important piece of the puzzle for this knowledge gap.

      Strength

      1) Memory studies have predominantly relied on binary choice (go or no-go) assays as measures of memory performance. While these assays are convenient and efficient, they fall short of providing a comprehensive understanding of underlying behavioral structures. In an effort to overcome this limitation, the current study used video recording and tracking software to delve deeper into memory-guided behavior. This innovative approach allowed the authors to uncover novel neurons and examine their contribution to behavior with a level of detail not possible with binary choice assays.

      2) This study used electron microscopy-based Drosophila hemibrain connectome data to reveal the synaptic connection between UpWiNs and MBON-α1 and MBON-α3. Using this method, the study shows that a single UpWiN receives direct input from both MBON-α1 and MBON-α3, which is confirmed by a functional imaging experiment. The connectome dataset also reveals several neurons downstream of UpWiNs, opening avenues for further research into the neural mechanisms linking memory and behavior.

      Weakness

      1) The authors repeatedly state in the manuscript that MBON-α1 and MBON-α3 convey appetitive or aversive memories, respectively. This assertion may not be entirely accurate. Evidence from sugar reward conditioning experiments suggests that MBON-α3 is potentiated and required for sugar reward memory retrieval. Therefore, the compartmentalization for appetitive and aversive memories appears not as obvious at the level of MBONs.

      What we intended was that activation of DANs in these compartments can induce aversive and appetitive memories, respectively, when paired with odors, and that these are the sole output pathway from these compartments to read out the memories in these compartments. As we previously proposed (Aso et al., 2014a eLife), these MBONs can integrate inputs from MBONs of other compartments and their activity can reflect appetitive memory stored as synaptic plasticity in other compartments. Since DANs in the α3 compartment respond to heat, bitter and electric shock but not sugar, the observation that MBON-α3 acquires an enhanced CS+ odor response after appetitive conditioning is presumably due to these intercompartmental connections rather than plasticity of KC-MBON synapses in the α3 compartment. In any case, the fact that excitatory activity of MBON-α1 and MBON-α3 conveys opposite valence of memory still holds true since appetitive conditioning induces depression and potentiation of odor responses, respectively.

      To clarify this point, we now cite related literature in the following sentence in the final paragraph of the Introduction: “UpWiNs receive inputs from several types of lateral horn neurons and integrate inhibitory and excitatory inputs from MBON-α1 and MBON-α3, which are the output neurons of MB compartments that store long-lasting appetitive or aversive memories, respectively (Aso and Rubin, 2016; Ichinose et al., 2015; Jacob and Waddell, 2022a; Pai et al., 2013; Yamagata et al., 2015).”

      2) This study did not conclusively establish the importance of the MBON-α1/α3 to UpWiN pathways in memory-driven behavior. In the experiments shown in Figure 5, flies were trained to associate the activation of reward-related DANs with a specific odor (CS+). After conditioning, UpWiNs were observed to show enhanced responses to the CS+ odor. However, the results should be interpreted with caution because the driver line used to activate DANs (R58E02-LexAp65) labels not only DANs projecting to the MBON-α1 compartment, but all DANs in the protocerebral anterior medial (PAM) cluster. Thus, it remains unclear to what extent the observed enhanced responses are influenced by changes in inhibitory inputs from MBON-α1. While UpWiNs have been shown to play a critical role in the expression of sugar reward memory (Figure 7), it should be noted that UpWiNs receive inputs from multiple upstream neurons, making it difficult to accurately assess the contribution of MBON-α1/α3 to UpWiN pathways in UpWiN recruitment. Further research is needed to fully address this issue.

      We totally agree with this point and added a sentence to explain an alternative mechanism. “This enhancement of CS+ response can be most easily explained as an outcome of disinhibition from MBON-α1 whose output had been decreased by memory formation; MBON-α1 is inhibitory to UpWiNs (Figure 4B) and MBON-α1 response to the CS+ is reduced following the same training protocol (Yamada et al. 2023). In addition to such a mechanism, plasticity in the β1 compartment may contribute to the enhanced CS+ response in UpWiNs because the driver R58E02 contains DANs in the β1 and glutamatergic MBON from the β1 directly synapse on the dendrites of MBON-α1 and MBON-α3.”

      3) UpWind neurons (UpWiNs) were so named because their activation promotes upwind locomotion. However, when activated in the absence of airflow, flies show increased locomotor speed and an increased probability of revisiting the same location (Figure 7 and Figure 7-figure supplement 1). The revisiting behavior can be observed during the activation of UpWiNs, which is distinct from the local search behavior that typically begins after a reward stimulus is turned off (e.g., Gr64f-GAL4 results in Figure 7-figure supplement 1).

      Return probability was calculated within a 15-s time window. High return probability during LED ON period (10-20s) in Figure 7-figure supplement 1 does not necessarily mean that flies returned during LED ON period. If a fly is at the position A when t=10s, to be counted as “returned”, it needs to move more than 10mm away from A and move back to the position less than 3mm distance from A by t=25s. In the case of sugar sensory neuron activation with Gr64f-GAL4, the peak of return probability is shifted toward a later time point because flies stop and extend proboscis during activation period.
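
      The return criterion described above can be sketched in code as follows. This is a minimal illustration, not the authors' analysis code: the function names, the track format (a list of (x, y) positions in mm sampled at a fixed frame rate), and the handling of window edges are all assumptions; only the 15-s window and the 10 mm and 3 mm thresholds come from the text.

```python
import math

def returned_within_window(track, start, fps,
                           window_s=15.0, away_mm=10.0, back_mm=3.0):
    """True if the fly leaves its position at frame `start` by more than
    `away_mm` and then comes back to within `back_mm` of that position,
    all within `window_s` seconds. `track` is a list of (x, y) in mm."""
    x0, y0 = track[start]
    stop = min(len(track), start + int(round(window_s * fps)) + 1)
    has_left = False
    for x, y in track[start:stop]:
        d = math.hypot(x - x0, y - y0)
        if d > away_mm:
            has_left = True
        elif has_left and d < back_mm:
            return True
    return False

def return_probability(tracks, start, fps, **criteria):
    """Fraction of flies that satisfy the return criterion from frame `start`."""
    hits = [returned_within_window(t, start, fps, **criteria) for t in tracks]
    return sum(hits) / len(hits)
```

      Under this definition, a fly evaluated at t=10s can be counted as "returned" even if the return itself happens after the LED is switched off, which is why a high value during the LED ON period does not by itself locate the return in time.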

      Because revisiting a location can also be a consequence of repeated turns, it seems more accurate to describe UpWiNs as controlling the speed and likelihood of turns and promoting upwind movement by integrating with neurons that sense the direction of airflow.

      The return probability plotted in Figure 7E is the probability of return to the position at the end of the LED period within 15 s after the LED period, when the angular speeds of SS33917>CsChrimson and SS33918>CsChrimson flies are identical to those of empty-split-GAL4>CsChrimson control flies (Figure 7-figure supplement 1). Thus, revisiting behavior cannot be explained by a simple increase in turning probability.

      Although functions of UpWiNs are not limited to promotion of wind-directed walking, we still think that “UpWind Neurons” is a practical name for broad readers and oral communications at the current stage of investigations, because EM neuron IDs and names (SMP348, SMP353, SMP354, SLP399 and SLP400) are too lengthy and do not contain any functional information. We initially defined a set of 11 neurons labeled by SS33197 split-GAL4 as “UpWind Neurons (UpWiNs)” based on initial optogenetic screening (Figure 2A). We found other driver lines for mushroom body interneuron cell types that can promote release of dopamine and a more robust returning phenotype (e.g. SS49755), but SS33917 remained the champion driver line for the upwind locomotion phenotype.

      Reviewer #3 (Public Review):

      Aso et al. provide insight into how learned valences are transformed into concrete memory-driven actions, using a diverse set of proven techniques.

      Here the authors use a four-armed arena to evaluate flies' preference for a reward-predicting odor and measure upwind locomotion. This behavioral paradigm was combined with the photoactivation of different memory-eliciting neurons, revealing that appetitive memories stored in different compartments of the mushroom bodies (center of olfactory memory) induce different levels of upwind locomotion. The authors then proceed to a non-exhaustive optogenetic screen of the neurons located downstream of the output neurons of the mushroom bodies (MBONs) and identify a group of 8-11 Cholinergic neurons promoting significant changes in upwind locomotion, the UpWins. By combining confocal immunolabelling of these neurons with electron microscope images, they manage to establish the UpWins' connectome within themselves and with the MBONs. Then, using two in vivo cell recording techniques, electrophysiology, and calcium imaging, they define that UpWins integrate both inhibitory and excitatory synaptic inputs from the MBONs encoding appetitive and aversive memory, respectively. In addition, they show that the UpWins' response to a reward-predicting odor is increased after appetitive training. On a behavioral level, the authors establish that the UpWins respond to wind direction only and are not involved in lower-level motor parameters, such as turning direction and acceleration. Finally, they demonstrate that the UpWins' activity is necessary for long-term appetitive memory retrieval, and even suggest a broader role for the UpWins in olfactory navigation, as their photoactivation increases the probability of revisiting behavior. In the end, the authors state that they provide new insights into how memory is translated into concrete behavior, which is fully supported by their data. Altogether, the authors present a pretty complete study that provides very interesting and reliable data, and that opens a new field of investigation into memory-driven behaviors.

      Strengths of the study:

      • To support their conclusions, the authors provide detailed data from different levels of analysis (behavioral, cellular, and molecular), using multiple sophisticated techniques.

      • The measurement of multiple parameters in the behavioral analysis supports the strong changes in upwind locomotion. In addition, taken individually these parameters provide precise insights into how upwind locomotion changes, and allow the authors to more precisely define the role of the UpWins.

      • The authors use split-Gal4 drivers instead of Gal4, allowing them to better refine neuron labelling.

      • The authors discussed and investigated all possible biases, making their data very reliable. For example, they demonstrated that the phenotypes observed in the behavioral assay were wind-directed behaviors and could not be explained by bias avoidance of the arena's center area.

      Limitations of the study:

      • In the absence of more precise drivers, the UpWins' labelling lacks precision. For example, there is no way to know exactly which UpWin is responding in the electrophysiological experiment presented in Figure 4.

      We have ongoing efforts to generate split-GAL4 and split-LexA driver lines for specific subsets of UpWiN neurons, but the data using those lines are not ready for this manuscript. However, we would like to point out that historically, identification of a group of neurons with striking phenotype has been foundational to promote follow-up studies. A good example is P1 neurons for courtship behavior.

      • The screening of neurons located downstream of the MBONs is not exhaustive, meaning that other groups of neurons might be involved in memory-driven upwind locomotion. However, this does not diminish the authors' conclusions.

      UpWiNs are certainly not the only cell type mediating memory-driven upwind locomotion, since our and other groups’ studies (e.g. Matheson et al., 2022; PMCID: PMC9360402) identified a collection of cell types that can promote upwind locomotion upon optogenetic activation.

      In 2021, we released images and driver lines of a larger collection of split-GAL4 driver lines at https://splitgal4.janelia.org. We are preparing a manuscript to provide anatomical descriptions of these lines. This collection of new drivers will help elucidate more comprehensive views of circuits for memory-driven actions.

      • All data were obtained with walking flies. So far, there have been no experiments on flying flies.

      This is an intriguing question and we mentioned in Discussion that “Our study was limited to walking behaviors, and the role of UpWiNs in flight behaviors remains to be investigated.”

    1. Author Response

      Reviewer #1 (Public Review):

      The authors present a PyTorch-based simulator for prosthetic vision. The model takes in the anatomical location of a visual cortical prosthesis as well as a series of electrical stimuli to be applied to each electrode, and outputs the resulting phosphenes. To demonstrate the usefulness of the simulator, the paper reproduces psychometric curves from the literature and uses the simulator in the loop to learn optimized stimuli.

      One of the major strengths of the paper is its modeling work - the authors make good use of existing knowledge about retinotopic maps and psychometric curves that describe phosphene appearance in response to single-electrode stimulation. Using PyTorch as a backbone is another strength, as it allows for GPU integration and seamless integration with common deep learning models. This work is likely to be impactful for the field of sight restoration.

      1) However, one of the major weaknesses of the paper is its model validation - while some results seem to be presented for data the model was fit on (as opposed to held-out test data), other results lack quantitative metrics and a comparison to a baseline ("null hypothesis") model. On the one hand, it appears that the data presented in Figs. 3-5 was used to fit some of the open parameters of the model, as mentioned in Subsection G of the Methods. Hence it is misleading to present these as model "predictions", which are typically presented for held-out test data to demonstrate a model's ability to generalize. Instead, this is more of a descriptive model than a predictive one, and its ability to generalize to new patients remains yet to be demonstrated.

      We agree that the original presentation of the model fits might give rise to unwanted confusion. In the revision, we have adapted the fit of the thresholding mechanism to include a 3-fold cross-validation, where part of the data was excluded during the fitting, and used as test sets to calculate the model’s performance. The results of the cross-validation are now presented in panel D of Figure 3. The fitting of the brightness and temporal dynamics parameters using cross-validation was not feasible due to the limited amount of quantitative data describing temporal dynamics and phosphene size and brightness for intracortical electrodes. To avoid confusion, we have adapted the corresponding text and figure captions to specify that we are using a fit as description of the data.
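
      As a rough illustration of the k-fold procedure mentioned above, the following sketch fits a threshold curve on two thirds of the data and scores it on the held-out third, three times over. Everything specific here is an assumption made for illustration: the logistic form of the psychometric function, the coarse grid search, the parameter ranges, and the synthetic data. It is not the fitting procedure used in the manuscript.

```python
import math

def sigmoid(charge, threshold, slope):
    """Assumed logistic psychometric function: P(perceived | charge per phase)."""
    return 1.0 / (1.0 + math.exp(-slope * (charge - threshold)))

def mse(points, threshold, slope):
    """Mean squared error between observed and modelled probabilities."""
    return sum((p - sigmoid(c, threshold, slope)) ** 2 for c, p in points) / len(points)

def fit_grid(points):
    """Coarse grid search; the parameter ranges are arbitrary illustrative choices."""
    thresholds = [0.5 * i for i in range(25)]     # 0.0 .. 12.0
    slopes = [0.25 * i for i in range(1, 13)]     # 0.25 .. 3.0
    candidates = ((t, s) for t in thresholds for s in slopes)
    return min(candidates, key=lambda ts: mse(points, ts[0], ts[1]))

def k_fold_errors(points, k=3):
    """Fit on k-1 folds, evaluate on the held-out fold, for each fold in turn."""
    errors = []
    for fold in range(k):
        test = [pt for i, pt in enumerate(points) if i % k == fold]
        train = [pt for i, pt in enumerate(points) if i % k != fold]
        threshold, slope = fit_grid(train)
        errors.append(mse(test, threshold, slope))
    return errors

# Synthetic, noise-free data generated from known parameters (not real measurements).
data = [(c, sigmoid(c, 6.0, 1.0)) for c in range(1, 13)]
fold_errors = k_fold_errors(data)
print(fold_errors)
```

      The held-out errors, rather than the error on the training folds, are what would support calling the output a prediction instead of a descriptive fit.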

      We note that the goal of the simulator is not to provide a single set of parameters that describes precise phosphene perception for all patients but that it could also be used to capture variability among patients. Indeed, the model can be tailored to new patients based on a small data set. Figure 3-figure supplement 1 exemplifies how our simulator can be tailored to several data sets collected from patients with surface electrodes. Future clinical experiments might be used to verify how well the simulator can be tailored to the data of other patients.

      Specifically, we have made the following changes to the manuscript:

      • Caption Figure 2: the fitted peak brightness levels reproduced by our model

      • Caption Figure 3: The model's probability of phosphene perception is visualized as a function of charge per phase

      • Caption Figure 3: Predicted probabilities in panel (d) are the results of a 3-fold cross-validation on held-out test data.

      • Line 250: we included biologically inspired methods to model the perceptual effects of different stimulation parameters

      • Line 271: Each frame, the simulator maps electrical stimulation parameters (stimulation current, pulse width and frequency) to an estimated phosphene perception

      • Lines 335-336: such that 95% of the Gaussian falls within the fitted phosphene size.

      • Line 469-470: Figure 4 displays the simulator's fit on the temporal dynamics found in a previous published study by Schmidt et al. (1996).

      • Lines 922-925: Notably, the trade-off between model complexity and accurate psychophysical fits or predictions is a recurrent theme in the validation of the components implemented in our simulator.

      2) On the other hand, the results presented in Fig. 8 as part of the end-to-end learning process are not accompanied by any sorts of quantitative metrics or comparison to a baseline model.

      We now realize that the presentation of the end-to-end results might have given the impression that we present novel image processing strategies. However, the development of a novel image processing strategy is outside the scope of the study. Instead, the study aims to provide an improved simulation which can be used for more realistic assessment of different stimulation protocols. The simulator needs to fit experimental data, and it should run fast (so it can be used in behavioral experiments). Importantly, as demonstrated in our end-to-end experiments, the model can be used in differentiable programming pipelines (so it can be used in computational optimization experiments), which is a valuable contribution in itself because it lends itself to many machine learning approaches which can improve the realism of the simulation.

      We have rephrased our study aims in the discussion to improve clarity.

      • Lines 275-279: In the sections below, we discuss the different components of the simulator model, followed by a description of some showcase experiments that assess the ability to fit recent clinical data and the practical usability of our simulator in simulation experiments

      • Lines 810-814: Computational optimization approaches can also aid in the development of safe stimulation protocols, because they allow a faster exploration of the large parameter space and enable task-driven optimization of image processing strategies (Granley et al., 2022; Fauvel et al., 2022; White et al., 2019; Küçükoglü et al. 2022; de Ruyter van Steveninck et al., 2022; Ghaffari et al., 2021).

      • Lines 814-819: Ultimately, the development of task-relevant scene-processing algorithms will likely benefit both from computational optimization experiments as well as exploratory SPV studies with human observers. With the presented simulator we aim to contribute a flexible toolkit for such experiments.

      • Lines 842-853: Eventually, the functional quality of the artificial vision will not only depend on the correspondence between the visual environment and the phosphene encoding, but also on the implant recipient's ability to extract that information into a usable percept. The functional quality of end-to-end generated phosphene encodings in daily life tasks will need to be evaluated in future experiments. Regardless of the implementation, it will always be important to include human observers (both sighted experimental subjects and actual prosthetic implant users) in the optimization cycle to ensure subjective interpretability for the end user (Fauvel et al., 2022; Beyeler & Sanchez-Garcia, 2022).

      3) The results seem to assume that all phosphenes are small Gaussian blobs, and that these phosphenes combine linearly when multiple electrodes are stimulated. Both assumptions are frequently challenged by the field. For all these reasons, it is challenging to assess the potential and practical utility of this approach as well as get a sense of its limitations.

      The reviewer raises a valid point and a similar point was raised by a different reviewer (our response is duplicated). As pointed out in the discussion, many aspects about multi-electrode phosphene perception are still unclear. On the one hand, the literature is in agreement that there is some degree of predictability: some papers explicitly state that phosphenes produced by multiple patterns are generally additive (Dobelle & Mladejovsky, 1974), that the locations are predictable (Bosking et al., 2018) and that multi-electrode stimulation can be used to generate complex, interpretable patterns of phosphenes (Chen et al., 2020, Fernandez et al., 2021). On the other hand, however, in some cases, the stimulation of multiple electrodes is reported to lead to brighter phosphenes (Fernandez et al., 2021), fused or displaced phosphenes (Schmidt et al., 1996, Bak et al., 1990) or unpredicted phosphene patterns (Fernández et al., 2021). It is likely that the probability of these interference patterns decreases when the distance between the stimulated electrodes increases. An empirical finding is that the critical distance for intracortical stimulation is approximately 1 mm (Ghose & Maunsell, 2012).

      We note that our simulator is not restricted to the simulation of linearly combined Gaussian blobs. Some irregularities, such as elongated phosphene shapes were already supported in the previous version of our software. Furthermore, we added a supplementary figure that displays a possible approach to simulate some of the more complex electrode interactions that are reported in the literature, with only minor adaptations to the code. Our study thereby aims to present a flexible simulation toolkit that can be adapted to the needs of the user.

      Adjustments:

      • Added Figure 1-figure supplement 3 on irregular phosphene percepts.

      • Lines 957-970: Furthermore, in contrast to the assumptions of our model, interactions between simultaneous stimulation of multiple electrodes can have an effect on the phosphene size and sometimes lead to unexpected percepts (Fernandez et al., 2021, Dobelle & Mladejovsky 1974, Bak et al., 1990). Although our software supports basic exploratory experimentation of non-linear interactions (see Figure 1-figure supplement 3), by default, our simulator assumes independence between electrodes. Multi-phosphene percepts are modeled using linear summation of the independent percepts. These assumptions seem to hold for intracortical electrodes separated by more than 1 mm (Ghose & Maunsell, 2012), but may underestimate the complexities observed when electrodes are nearer. Further clinical and theoretical modeling work could help to improve our understanding of these non-linear dynamics.
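      The default independence assumption described above can be illustrated with a minimal sketch. This is not the simulator's actual implementation: the function names, the pixel grid, and the phosphene parameters are hypothetical, and each phosphene is assumed to be an isotropic Gaussian.

```python
import math

def gaussian_phosphene(size, cx, cy, sigma, brightness):
    """Render one phosphene as an isotropic 2D Gaussian on a size x size grid."""
    return [[brightness * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def render_linear(size, phosphenes):
    """Sum the independent phosphenes pixel by pixel (no interaction term)."""
    image = [[0.0] * size for _ in range(size)]
    for cx, cy, sigma, brightness in phosphenes:
        blob = gaussian_phosphene(size, cx, cy, sigma, brightness)
        for y in range(size):
            for x in range(size):
                image[y][x] += blob[y][x]
    return image

# Hypothetical phosphenes: (center x, center y, sigma in pixels, brightness).
phosphenes = [(8, 8, 2.0, 0.6), (20, 8, 2.0, 0.6), (10, 8, 2.0, 0.6)]
image = render_linear(32, phosphenes)

print(image[8][20])  # isolated phosphene: close to its own brightness
print(image[8][9])   # between two nearby phosphenes: brightnesses add up
```

      In this sketch, nearby phosphenes simply add, so the summed value between two close electrodes exceeds either individual brightness. Fusion, displacement, or other non-linear interactions reported for closely spaced electrodes are not captured by such a sum.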

      4) Another weakness of the paper is the term "biologically plausible", which appears throughout the manuscript but is not clearly defined. In its current form, it is not clear what makes this simulator "biologically plausible" - it certainly contains a retinotopic map and is fit on psychophysical data, but it does not seem to contain any other "biological" detail.

      We thank the reviewer for the remark. We improved our description of what makes the simulator “biologically plausible” in the introduction (line 78): ‘‘Biological plausibility, in our work's context, points to the simulation's ability to capture essential biological features of the visual system in a manner consistent with empirical findings: our simulator integrates quantitative findings and models from the literature on cortical stimulation in V1 [...]”. In addition, we mention in the discussion (lines 611 - 621): “The aim of this study is to present a biologically plausible phosphene simulator, which takes realistic ranges of stimulation parameters, and generates a phenomenologically accurate representation of phosphene vision using differentiable functions. In order to achieve this, we have modeled and incorporated an extensive body of work regarding the psychophysics of phosphene perception. From the results presented in section H, we observe that our simulator is able to produce phosphene percepts that match the descriptions of phosphene vision that were gathered in basic and clinical visual neuroprosthetics studies over the past decades.”

      5) In fact, for the most part the paper seems to ignore the fact that implanting a prosthesis in one cerebral hemisphere will produce phosphenes that are restricted to one half of the visual field. Yet Figures 6 and 8 present phosphenes that seemingly appear in both hemifields. I do not find this very "biologically plausible".

      We agree with the reviewer that contemporary experiments with implantable electrodes usually test electrodes in a single hemisphere. However, future clinically useful approaches should use bilaterally implanted electrode arrays. Our simulator can present phosphene locations in either one or both hemifields.

      We have made the following textual changes:

      • Fig. 1 caption: Example renderings after initializing the simulator with four 10 × 10 electrode arrays (indicated with roman numerals) placed in the right hemisphere (electrode spacing: 4 mm, in correspondence with the commonly used 'Utah array' (Maynard et al., 1997)).

      • Line 518-525: The simulator is initialized with 1000 possible phosphenes in both hemifields, covering a field of view of 16 degrees of visual angle. Note that the simulated electrode density and placement differs from current prototype implants and the simulation can be considered to be an ambitious scenario from a surgical point of view, given the folding of the visual cortex and the part of the retinotopic map in V1 that is buried in the calcarine sulcus.

      • Line 546-547: with the same phosphene coverage as the previously described experiment

      Reviewer #2 (Public Review):

      Van der Grinten and De Ruyter van Steveninck et al. present a design for simulating cortical-visual-prosthesis phosphenes that emphasizes features important for optimizing the use of such prostheses. The characteristics of simulated individual phosphenes were shown to agree well with data published from the use of cortical visual prostheses in humans. By ensuring that functions used to generate the simulations were differentiable, the authors permitted and demonstrated integration of the simulations into deep-learning algorithms. In concept, such algorithms could thereby identify parameters for translating images or videos into stimulation sequences that would be most effective for artificial vision. There are, however, limitations to the simulation that will limit its applicability to current prostheses.

      The verification of how phosphenes are simulated for individual electrodes is very compelling. Visual-prosthesis simulations often do ignore the physiologic foundation underlying the generation of phosphenes. The authors' simulation takes into account how stimulation parameters contribute to phosphene appearance and show how that relationship can fit data from actual implanted volunteers. This provides an excellent foundation for determining optimal stimulation parameters with reasonable confidence in how parameter selections will affect individual-electrode phosphenes.

      We thank the reviewer for these supportive comments.

      Issues with the applicability and reliability of the simulation are detailed below:

      1) The utility of this simulation design, as described, unfortunately breaks down beyond the scope of individual electrodes. To model the simultaneous activation of multiple electrodes, the authors' design linearly adds individual-electrode phosphenes together. This produces relatively clean collections of dots that one could think of as pixels in a crude digital display. Modeling phosphenes in such a way assumes that each electrode and the network it activates operate independently of other electrodes and their neuronal targets. Unfortunately, as the authors acknowledge and as noted in the studies they used to fit and verify individual-electrode phosphene characteristics, simultaneous stimulation of multiple electrodes often obscures features of individual-electrode phosphenes and can produce unexpected phosphene patterns. This simulation does not reflect these nonlinearities in how electrode activations combine. Nonlinearities in electrode combinations can be as subtle as the phosphenes becoming brighter while still remaining distinct, or as problematic as generating only a single small phosphene that is indistinguishable from the activation of a subset of the electrodes activated, or that of a single electrode.

      If a visual prosthesis happens to generate some phosphenes that can be elicited independently, a simulator of this type could perhaps be used by processing stimulation from independent groups of electrodes and adding their phosphenes together in the visual field.

      The reviewer raises a valid point and a similar point was raised by a different reviewer (our response is duplicated). As pointed out in the discussion, many aspects about multi-electrode phosphene perception are still unclear. On the one hand, the literature is in agreement that there is some degree of predictability: some papers explicitly state that phosphenes produced by multiple patterns are generally additive (Dobelle & Mladejovsky, 1974), that the locations are predictable (Bosking et al., 2018) and that multi-electrode stimulation can be used to generate complex, interpretable patterns of phosphenes (Chen et al., 2020, Fernandez et al., 2021). On the other hand, however, in some cases, the stimulation of multiple electrodes is reported to lead to brighter phosphenes (Fernandez et al., 2021), fused or displaced phosphenes (Schmidt et al., 1996, Bak et al., 1990) or unpredicted phosphene patterns (Fernández et al., 2021). It is likely that the probability of these interference patterns decreases when the distance between the stimulated electrodes increases. An empirical finding is that the critical distance for intracortical stimulation is approximately 1 mm (Ghose & Maunsell, 2012).

      We note that our simulator is not restricted to the simulation of linearly combined Gaussian blobs. Some irregularities, such as elongated phosphene shapes were already supported in the previous version of our software. Furthermore, we added a supplementary figure that displays a possible approach to simulate some of the more complex electrode interactions that are reported in the literature, with only minor adaptations to the code. Our study thereby aims to present a flexible simulation toolkit that can be adapted to the needs of the user.

      Adjustments:

      • Lines 957-970: Furthermore, in contrast to the assumptions of our model, interactions between simultaneous stimulation of multiple electrodes can have an effect on the phosphene size and sometimes lead to unexpected percepts (Fernandez et al., 2021, Dobelle & Mladejovsky 1974, Bak et al., 1990). Although our software supports basic exploratory experimentation of non-linear interactions (see Figure 1-figure supplement 3), by default, our simulator assumes independence between electrodes. Multi-phosphene percepts are modeled using linear summation of the independent percepts. These assumptions seem to hold for intracortical electrodes separated by more than 1 mm (Ghose & Maunsell, 2012), but may underestimate the complexities observed when electrodes are nearer. Further clinical and theoretical modeling work could help to improve our understanding of these non-linear dynamics.

      • Added Figure 1-figure supplement 3 on irregular phosphene percepts.

      2) Verification of how the simulation renders individual phosphenes based on stimulation parameters is an important step in confirming agreement between the simulation and the function of implanted devices. That verification was well demonstrated. The end use of a visual-prosthesis simulation, however, would likely not be optimizing just the appearance of phosphenes, but predicting and optimizing functional performance in visual tasks. Investigating whether this simulator can suggest visual-task performance, either with sighted volunteers or a decoder model, that is similar to published task performance from visual-prosthesis implantees would be a necessary step for true validation.

      We agree with the reviewer that it will be vital to investigate the utility of the simulator in tasks. However, the literature on the performance of users of a cortical prosthesis in visually-guided tasks is scarce, making it difficult to compare task performance between simulated versus real prosthetic vision.

      Secondly, the main objective of the current study is to propose a simulator that emulates the sensory / perceptual experience, i.e. the low-level perceptual correspondence. Once more behavioral data from prosthetic users become available, studies can use the simulator to make these comparisons.

      Regarding the comparison to simulated prosthetic vision in sighted volunteers, there are some fundamental limitations. For instance, sighted subjects are exposed for a shorter duration to the (simulated) artificial percept and lack the experience and training that prosthesis users get. Furthermore, sighted subjects may be unfamiliar with compensation strategies that blind individuals have developed. It will therefore be important to conduct clinical experiments.

      To convey more clearly that our experiments are performed to verify the practical usability in future behavioral experiments, we have incorporated the following textual adjustments:

      • Lines 275-279: In the sections below, we discuss the different components of the simulator model, followed by a description of some showcase experiments that assess the ability to fit recent clinical data and the practical usability of our simulator in simulation experiments.

      • Lines 842-853: Eventually, the functional quality of the artificial vision will not only depend on the correspondence between the visual environment and the phosphene encoding, but also on the implant recipient's ability to extract that information into a usable percept. The functional quality of end-to-end generated phosphene encodings in daily life tasks will need to be evaluated in future experiments. Regardless of the implementation, it will always be important to include human observers (both sighted experimental subjects and actual prosthetic implant users) in the optimization cycle to ensure subjective interpretability for the end user (Fauvel et al., 2022; Beyeler & Sanchez-Garcia, 2022).

      3) A feature of this simulation is being able to convert stimulation of V1 to phosphenes in the visual field. If used, this feature would likely only be able to simulate a subset of phosphenes generated by a prosthesis. Much of V1 is buried within the calcarine sulcus, and electrode placement within the calcarine sulcus is not currently feasible. As a result, stimulation of visual cortex typically involves combinations of the limited portions of V1 that lie outside the sulcus and higher visual areas, such as V2.

      We agree that some areas (most notably the calcarine sulcus) are difficult to access in a surgical implantation procedure. A realistic simulation of state-of-the-art cortical stimulation should only partially cover the visual field with phosphenes. However, it may be predicted that some of these challenges will be addressed by new technologies. We chose to make the simulator as generally applicable as possible and users of the simulator can decide which phosphene locations are simulated. To demonstrate that our simulator can be flexibly initialized to simulate specific implantation locations using third-party software, we have now added a supplementary figure (Figure 1-figure supplement 1) that displays a demonstration of an electrode grid placement on a 3D brain model, generating the phosphene locations from receptive field maps. However, the simulator is general and can also be used to guide future strategies that aim to e.g. cover the entire field with electrodes, compare performance between upper and lower hemifields etc.

      Reviewer #3 (Public Review):

      The authors are presenting a new simulation for artificial vision that incorporates many recent advances in our understanding of the neural response to electrical stimulation, specifically within the field of visual prosthetics. The authors succeed in integrating multiple results from other researchers on aspects of V1 response to electrical stimulation to create a system that more accurately models V1 activation in a visual prosthesis than other simulators. The authors then attempt to demonstrate the value of such a system by adding a decoding stage and using machine-learning techniques to optimize the system to various configurations.

      1) While there is merit to being able to apply various constraints (such as maximum current levels) and have the system attempt to find a solution that maximizes recoverable information, the interpretability of such encodings to a hypothetical recipient of such a system is not addressed. The authors demonstrate that they are able to recapitulate various standard encodings through this automated mechanism, but the advantages to using it as opposed to mechanisms that directly detect and encode, e.g., edges, are insufficiently justified.

      We thank the reviewer for this constructive remark. Our simulator is designed for more realistic assessment of different stimulation protocols in behavioral experiments or in computational optimization experiments. The presented end-to-end experiments are a demonstration of the practical usability of our simulator in computational experiments, building on a previously existing line of research. In fact, our simulator is compatible with any arbitrary encoding strategy.

      As our paper is focused on the development of a novel tool for this existing line of research, we do not aim to make claims about the functional quality of end-to-end encoders compared to alternative encoding methods (such as edge detection). That said, we agree with the reviewer that it is useful to discuss the benefits of end-to-end optimization compared to e.g. edge detection.

      We have incorporated several textual changes to give a more nuanced overview and to acknowledge that many benefits remain to be tested. Furthermore, we have restated our study aims more clearly in the discussion to clarify the distinction between the goals of the current paper and the various encoding strategies that remain to be tested.

      • Lines 275-279: In the sections below, we discuss the different components of the simulator model, followed by a description of some showcase experiments that assess the ability to fit recent clinical data and the practical usability of our simulator in simulation experiments

      • Lines 810-814: Computational optimization approaches can also aid in the development of safe stimulation protocols, because they allow a faster exploration of the large parameter space and enable task-driven optimization of image processing strategies (Granley et al., 2022; Fauvel et al., 2022; White et al., 2019; Küçükoglü et al. 2022; de Ruyter van Steveninck, Güçlü et al., 2022; Ghaffari et al., 2021).

      • Lines 842-853: Eventually, the functional quality of the artificial vision will not only depend on the correspondence between the visual environment and the phosphene encoding, but also on the implant recipient's ability to extract that information into a usable percept. The functional quality of end-to-end generated phosphene encodings in daily life tasks will need to be evaluated in future experiments. Regardless of the implementation, it will always be important to include human observers (both sighted experimental subjects and actual prosthetic implant users) in the optimization cycle to ensure subjective interpretability for the end user (Fauvel et al., 2022; Beyeler & Sanchez-Garcia, 2022).

      2) The authors make a few mistakes in their interpretation of biological mechanisms, and the introduction lacks appropriate depth of review of existing literature, giving the reader the mistaken impression that this simulator is the only attempt ever made at biologically plausible simulation, rather than merely the most recent refinement that builds on decades of work across the field.

      We thank the reviewer for this insight. We have improved the coverage of the previous literature to give credit where credit is due, and to address the long history of simulated phosphene vision.

      Textual changes:

      • Lines 64-70: Although the aforementioned SPV literature has provided us with major fundamental insights, the perceptual realism of electrically generated phosphenes and some aspects of the biological plausibility of the simulations can be further improved by integrating existing knowledge of phosphene vision and its underlying physiology.

      • Lines 164-190: The aforementioned studies used varying degrees of simplification of phosphene vision in their simulations. For instance, many included equally-sized phosphenes that were uniformly distributed over the visual field (informally referred to as the ‘scoreboard model’). Furthermore, most studies assumed either full control over phosphene brightness or used binary levels of brightness (e.g. 'on' / 'off'), but did not provide a description of the associated electrical stimulation parameters. Several studies have explicitly made steps towards more realistic phosphene simulations, by taking into account cortical magnification or using visuotopic maps (Fehervari et al., 2010; Li et al., 2013; Srivastava et al., 2009; Paraskevoudi et al., 2021), simulating noise and electrode dropout (Dagnelie et al., 2007), or using varying levels of brightness (Vergnieux et al., 2017; Sanchez-Garcia et al., 2022; Parikh et al., 2013). However, no phosphene simulations have modeled temporal dynamics or provided a description of the parameters used for electrical stimulation. Some recent studies developed descriptive models of the phosphene size or brightness as a function of the stimulation parameters (Winawer et al., 2016; Bosking et al., 2017). Another very recent study has developed a deep-learning based model for predicting a realistic phosphene percept for single stimulating electrodes (Granley et al., 2022). These studies have made important contributions to improve our understanding of the effects of different stimulation parameters. The present work builds on these previous insights to provide a full simulation model that can be used for the functional evaluation of cortical visual prosthetic systems.

      • Lines 137-140: Due to the cortical magnification (the foveal information is represented by a relatively large surface area in the visual cortex as a result of variation of retinal RF size) the size of the phosphene increases with its eccentricity (Winawer & Parvizi, 2016, Bosking et al., 2017).

      • Lines 883-893: Even after loss of vision, the brain integrates eye movements for the localization of visual stimuli (Reuschel et al., 2012), and in cortical prostheses the position of the artificially induced percept will shift along with eye movements (Brindley & Lewin, 1968, Schmidt et al., 1996). Therefore, in prostheses with a head-mounted camera, misalignment between the camera orientation and the pupillary axes can induce localization problems (Caspi et al., 2018; Paraskevoudi & Pezaris, 2019; Sabbah et al., 2014; Schmidt et al., 1996). Previous SPV studies have demonstrated that eye-tracking can be implemented to simulate the gaze-coupled perception of phosphenes (Cha et al., 1992; Sommerhalder et al., 2004; Dagnelie et al., 2006; McIntosh et al., 2013, Paraskevoudi & Pezaris, 2021; Rassia & Pezaris 2018, Titchener et al., 2018, Srivastava et al., 2009)
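      The relation between cortical magnification and phosphene size mentioned above (lines 137-140) can be made concrete with a small sketch. The inverse-linear magnification model and the constants `k` and `a` are illustrative assumptions for this example and are not the fitted values from the manuscript. The function names are hypothetical as well.

```python
def cortical_magnification(eccentricity_deg, k=17.3, a=0.75):
    """Assumed inverse-linear model: mm of cortex per degree of visual angle."""
    return k / (eccentricity_deg + a)

def phosphene_size_deg(activated_cortex_mm, eccentricity_deg):
    """Visual-field extent (deg) covered by a patch of activated cortex (mm)."""
    return activated_cortex_mm / cortical_magnification(eccentricity_deg)

# The same 1 mm of activated cortex at increasing eccentricities.
sizes = [phosphene_size_deg(1.0, ecc) for ecc in (1.0, 5.0, 10.0)]
for ecc, size in zip((1.0, 5.0, 10.0), sizes):
    print(f"eccentricity {ecc:4.1f} deg -> phosphene size {size:.3f} deg")
```

      Because magnification falls with eccentricity, a fixed extent of activated cortex maps onto a larger portion of the visual field in the periphery, which is why the simulated phosphenes grow with eccentricity.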

      3) The authors have importantly not included gaze position compensation which adds more complexity than the authors suggest it would, and also means the simulator lacks a basic, fundamental feature that strongly limits its utility.

      We agree with the reviewer that the inclusion of gaze position to simulate gaze-centered phosphene locations is an important requirement for a realistic simulation. We have made several textual adjustments to section M1 to improve the clarity of the explanation and we have added several references to address the simulation literature that took eye movements into account.

      In addition, we included a link to some demonstration videos in which we illustrate that the simulator can be used for gaze-centered phosphene simulation. The simulation models the phosphene locations based on the gaze direction, and updates the input with changes in the gaze direction. The stimulation pattern is chosen to encode the visual environment at the location where the gaze is directed. Gaze contingent processing has been implemented in prior simulation studies (for instance: Paraskevoudi et al., 2021; Rassia et al., 2018; Titchener et al., 2018) and even in the clinical setting with users of the Argus II implant (Caspi et al., 2018). From a modeling perspective, it is relatively straightforward to simulate gaze-centered phosphene locations and gaze contingent image processing (our code will be made publicly available). At the same time, however, seen from a clinical and hardware engineering perspective, the implementation of eye-tracking in a prosthetic system for blind individuals might come with additional complexities. This is now acknowledged explicitly in the manuscript.

      Textual adjustment:

      Lines 883-910: Even after loss of vision, the brain integrates eye movements for the localization of visual stimuli (Reuschel et al., 2012), and in cortical prostheses the position of the artificially induced percept will shift along with eye movements (Brindley & Lewin, 1968; Schmidt et al., 1996). Therefore, in prostheses with a head-mounted camera, misalignment between the camera orientation and the pupillary axes can induce localization problems (Caspi et al., 2018; Paraskevoudi & Pezaris, 2019; Sabbah et al., 2014; Schmidt et al., 1996). Previous SPV studies have demonstrated that eye-tracking can be implemented to simulate the gaze-coupled perception of phosphenes (Cha et al., 1992; Sommerhalder et al., 2004; Dagnelie et al., 2006; McIntosh et al., 2013; Paraskevoudi et al., 2021; Rassia et al., 2018; Titchener et al., 2018; Srivastava et al., 2009). Note that some of the cited studies implemented a simulation condition where not only the simulated phosphene locations, but also the stimulation protocol depended on the gaze direction. More specifically, instead of representing the head-centered camera input, the stimulation pattern was chosen to encode the external environment at the location where the gaze was directed. While further research is required, there is some preliminary evidence that such gaze-contingent image processing can improve the functional and subjective quality of prosthetic vision (Caspi et al., 2018; Paraskevoudi et al., 2021; Rassia et al., 2018; Titchener et al., 2018). Some example videos of gaze-contingent simulated prosthetic vision can be retrieved from our repository (https://github.com/neuralcodinglab/dynaphos/blob/main/examples/). Note that an eye-tracker will be required to produce gaze-contingent image processing in visual prostheses and there might be unforeseen complexities in the clinical implementation thereof. The study of oculomotor behavior in blind individuals (with or without a visual prosthesis) is still an ongoing line of research (Caspi et al., 2018; Kwon et al., 2013; Sabbah et al., 2014; Hafed et al., 2016).
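
      The two gaze-dependent steps described above (phosphene locations that move with the gaze, and a stimulation pattern that encodes the scene at the gazed-at location) can be illustrated with a small sketch. This is not the simulator's code or API: the function name `gaze_contingent_frame`, the integer pixel grid and the toy scene are all assumptions made purely for illustration.

```python
# Minimal sketch of gaze-contingent simulation (hypothetical helper, not the
# simulator's API). Phosphene positions are stored as offsets relative to the
# point of gaze, so they shift whenever the gaze shifts.

def gaze_contingent_frame(scene, gaze, offsets):
    """Return [(screen_position, activation), ...] for one gaze sample.

    scene   -- 2D list of brightness values, indexed scene[y][x]
    gaze    -- (x, y) pixel the eye is directed at
    offsets -- per-phosphene (dx, dy) relative to the gaze point
    """
    height, width = len(scene), len(scene[0])
    frame = []
    for dx, dy in offsets:
        # Gaze-centered phosphene location on the screen
        x, y = gaze[0] + dx, gaze[1] + dy
        inside = 0 <= x < width and 0 <= y < height
        # Gaze-contingent processing: encode the scene where the gaze lands,
        # rather than a fixed, head-centered part of the camera image
        frame.append(((x, y), scene[y][x] if inside else 0.0))
    return frame


# Toy scene: a single bright pixel at x=3, y=2
scene = [[0.0] * 5 for _ in range(5)]
scene[2][3] = 1.0
offsets = [(0, 0), (1, 0)]  # two phosphenes: at the gaze point and one pixel right

looking_left = gaze_contingent_frame(scene, (2, 2), offsets)
looking_right = gaze_contingent_frame(scene, (3, 2), offsets)
```

      In this toy case the same bright pixel activates a different phosphene once the gaze moves, which is the behavior an eye-tracker would be needed to reproduce in a real device.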

      4) Finally, the computational capacity required to run the described system is substantial and is not one that would plausibly be used as part of an actual device, suggesting that there may be difficulties with converting results from this simulator to an implantable system.

      The software runs in real time with affordable, consumer-grade hardware. In Author response image 1 we present the results of performance testing with a 2016 model MSI GeForce GTX 1080 (priced around €600).

      Author response image 1.

      Note that the GPU is used only for the computation and rendering of the phosphene representations from given electrode stimulation patterns, which will never be part of any prosthetic device. The choice of encoder to generate the stimulation patterns will determine the required processing capacity that needs to be included in the prosthetic system, which is unrelated to the simulator’s requirements.

      The following addition was made to the text:

      • Lines 488-492: Notably, even on a consumer-grade GPU (e.g. a 2016 model GeForce GTX 1080) the simulator still reaches real-time processing speeds (>100 fps) for simulations with 1000 phosphenes at 256x256 resolution.

      5) With all of that said, the results do represent an advance, and one that could have wider impact if the authors were to reduce the computational requirements, and add gaze correction.

      We appreciate the kind compliment from the reviewer and sincerely hope that our revised manuscript meets their expectations. Their feedback has been critical in reshaping and improving this work.

    1. Author Response

      Review #1 Public Review:

      This is an interesting study which attempts to assess the effect of the pandemic on diagnoses of pancreatic cancer. The authors have used a large national database to evaluate this, however, it should be noted that this database only captures 40% of the population in England. The authors have looked at specific parameters including Body Mass Index (BMI) as well as markers of diabetes and liver function. Only BMI had a difference in the frequency of measurements during the pandemic, presumably due to reduced face-to-face visits to allow weight and height to be captured.

      Interestingly the authors noticed a reduction in surgery for pancreatic cancer by 25%, yet reported that there were no differences in the frequency of death within 6 months following the diagnosis of pancreatic cancer. The reduction in surgery is likely related at least in part to the loss of operating lists due to pandemic restrictions, however, this paper is not equipped to address another important possibility behind this, which is that pancreatic cancers were presenting too late for surgical intervention. It is not sufficient to comment that pancreatic cancer treatment was not affected by the pandemic based on the data presented on deaths within 6 months of the diagnosis of pancreatic cancer alone, as the median survival of patients diagnosed with pancreatic cancer within the pandemic has not been captured and compared to that of patients diagnosed in the preceding 5 years.

      Therefore while the study can conclude no difference in pancreatic cancer diagnoses before and during the pandemic, more work needs to be done to truly assess if the pandemic had any effect on the outcomes from pancreatic cancer for patients diagnosed within this timeframe.

      Thank you for taking time to undertake the review and for all the constructive comments. This study was designed to assess the effect of the pandemic on pancreatic cancer services in England. We focused on the quantity of healthcare.

      We acknowledge and understand the comments by the reviewer with regard to the limitations of this study in relation to the effect of the COVID-19 pandemic on diagnosis and survival. We did not assess the effect of the pandemic on staging information or length of survival.

    1. Author Response

      Reviewer #1 (Public Review):

      This research aimed to discern the pattern of methylation changes that occur during aging, distinguishing between a unified specific mechanism and stochastic changes. To date, no unified hypothesis exists to guide our understanding of the changes in chromatin geography observed during the aging of cells. This work analysed six different types of purified blood-borne white blood cells, allowing comparison across different immune cell subsets to determine if similar patterns occurred in all cell populations. Intriguingly, each subset exhibited its own distinct differential methylation rather than a single program. However, a core set of gene changes close to age-associated CpGs was identified, suggesting that a central program existed, but that individual cell type function and metabolism shaped the overall chromatin landscape for the population. These findings establish a new framework for considering the aging process and open new questions about how the individual clocks of different populations might be regulated. While circulating cells are readily accessible for evaluation in humans, the majority of immune cells that regulate immune homeostasis are found within the tissues of the body. Whether these cells exhibit a similar profile to circulating cells or are rather shaped by their tissue or organ-specific ecosystem remains to be determined. In this setting, these tissue-resident cells are exposed to very different oxygen tensions and metabolic substrates. Furthermore, although the genes identified have been associated with aging, they concurrently appear to be associated with inflammation; thus it is not clear whether aging and low-grade inflammation are inherently linked, or whether these two pathways can be segregated. Thus a number of questions remain, warranting further investigation.

      The reviewer makes a very good point regarding different tissue-resident cells being exposed to different oxygen and metabolic stress. In the reviewed manuscript, Arid3a emerges as one of the transcription factors with motifs in and around probes hypermethylated with age in monocytes. Arid3a is known to target inflammatory genes, but future research is warranted to establish the link between aging and low-grade inflammation. To address the comment about the connection between aging and low-grade inflammation, in the revised manuscript we have incorporated a new analysis of SomaScan array-derived protein levels of seven cytokines from the same cohort of donors. We tested the hypothesis that part of the age-associated changes in DNA methylation are connected with the well-known age-related proinflammatory state. We have now added the details in the Results and Methods sections. Briefly, we ran two regression models (CpG_i ~ age + sex and CpG_i ~ age + sex + analyte_j, where i indexes the CpG probes from the EPIC array and j indexes the seven cytokines). We find that DNA methylation levels at nearly 7000-9000 CpG sites in CD4 cells and 124 CpG sites in B cells that were originally age-associated are also associated with increasing levels of TNFRSF1A, TNFRSF1B and TNF-alpha, indicating a link between DNA methylation change, aging and inflammatory cytokine levels.
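
      For readers unfamiliar with this kind of nested comparison, the sketch below fits the two models for a single probe and a single cytokine using ordinary least squares. It is only an illustration under stated assumptions: the data are synthetic and constructed so that methylation depends on the analyte alone, the helper `ols` is hypothetical, and the response does not specify which estimator, test or significance threshold the authors actually used.

```python
# Illustrative sketch only: synthetic data, hypothetical helper names.
# Model 1: CpG ~ age + sex          Model 2: CpG ~ age + sex + analyte

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for idx in range(k):
            if idx != col:
                factor = aug[idx][col] / aug[col][col]
                aug[idx] = [v - factor * p for v, p in zip(aug[idx], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Synthetic donors (assumed values, not study data)
age = [25, 32, 41, 48, 55, 63, 70, 78]
sex = [0, 1, 0, 1, 0, 1, 0, 1]
jitter = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
# The analyte (e.g. a cytokine level) rises with age
analyte = [0.05 * a + e for a, e in zip(age, jitter)]
# By construction, methylation depends only on the analyte
methylation = [0.30 + 0.02 * c for c in analyte]

# Coefficient order: intercept, age, sex (, analyte)
base = ols([[1.0, a, s] for a, s in zip(age, sex)], methylation)
full = ols([[1.0, a, s, c] for a, s, c in zip(age, sex, analyte)], methylation)

print("age coefficient without analyte:", base[1])
print("age coefficient with analyte:   ", full[1])
print("analyte coefficient:            ", full[3])
```

      In this constructed case the probe looks age-associated in the first model, while in the second model the analyte term absorbs the effect. Real data would of course be noisier and would need a formal test with multiple-testing correction across probes.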

    1. Author Response

      Reviewer #1 (Public Review):

      The authors convincingly show in this study the effects of the fas5 gene on changes in the CHC profile and the importance of these changes toward sexual attractiveness.

      The main strength of this study lies in its holistic approach (from genes to behaviour) showing a full and convincing picture of the stated conclusions. The authors succeeded in putting a very interdisciplinary set of experiments together to support the main claims of this manuscript.

      We appreciate the kind comments from the reviewer.

      The main weakness stems from the lack of transparency behind the statistical analyses conducted in the study. Detailed statistical results are never mentioned in the text, nor is it always clear what was compared to what. I also believe that some tests that were conducted are not adequate for the given data. I am therefore unable to properly assess the significance of the results from the presented information. Nevertheless, the graphical representations are convincing enough for me to believe that a revision of the statistics would not significantly affect the main conclusions of this manuscript.

      We apologize for neglecting a detailed description of the statistical tests that were performed. We have added paragraphs to the Methods section that specifically explain the statistical analyses (lines 435-445; 489-502; 559-561; 586-591).

      The second major problem I had with the study was how it brushes over the somewhat contradicting results they found in males (Fig S2). These are only mentioned twice in the main text and in both cases as being "similarly affected", even though their own stats seem to indicate otherwise for many of the analysed compound groups. This also should affect the main conclusion concerning the effects of fas5 genes in the discussion, a more careful wording when interpreting the results is therefore necessary.

      Thank you for pointing this out. Although our focus clearly lay on the female CHC profiles, since a function in sexual signaling has thus far only been described for them, we have now elaborated on the results and discussion for the fas5 RNAi males (lines 167-178; 258-268).

      Reviewer #2 (Public Review):

      Insects have long been known to use cuticular hydrocarbons for communication. While the general pathways for hydrocarbon synthesis have been worked out, their specificity and in particular the specificity of the different enzymes involved is surprisingly little understood. Here, the authors convincingly demonstrate that a single fatty acid synthase gene is responsible for a shift in the positions of methyl groups across the entire alkane spectrum of a wasp, and that the wasp males recognize females specifically based on these methyl group positions. The strength of the study is the combination of gene expression manipulations with behavioural observations evaluating the effect of the associated changes in the cuticular hydrocarbon profiles. The authors make sure that the behavioural effect is indeed due to the chemical changes by not only testing live animals, but also dead animals and corpses with manipulated cuticular hydrocarbons.

      I find the evidence that the hydrocarbon changes do not affect survival and desiccation resistance less convincing (due to the limited set of conditions and relatively small sample size), but the data presented are certainly congruent with the idea that the methyl alkane changes do not have large effects on desiccation.

      We appreciate the kind comments from the reviewer.

      Reviewer #3 (Public Review):

      In this manuscript, the authors are aiming to demonstrate that a fatty-acyl synthase gene (fas5) is involved in the composition of the blend of surface hydrocarbons of a parasitoid wasp and that it affects the sexual attractiveness of females for males. Overall, the manuscript reads very well, it is very streamlined, and the authors' claims are mostly supported by their experiments and observations.

      We appreciate the kind comments from the reviewer.

      However, I find that some experiments, information and/or discussion are absent to assess how the effects they observe are, at least in part, not due to other factors than fas5 and the methyl-branched (MB) alkanes. I'm also wondering if what the authors observe is only a change in the sexual attractiveness of females and not related to species recognition as well.

      We appreciate the interesting point that the reviewer raises in sexual attractiveness and species recognition and now expand upon this potential aspect in the discussion (lines 327-330). However, in this manuscript, we very much focused on the effect of fas5 knockdown on the conveyance of female sexual attractiveness in a single species (Nasonia vitripennis). Therefore, we argue that species recognition constitutes a different communication modality here, and we currently cannot infer whether and how species recognition is exactly encoded in Nasonia CHC profiles despite some circumstantial evidence for species-specificity (Buellesbach et al. 2013; Mair et al. 2017). Thus, we would like to refrain from any further speculation on species recognition before this can be unambiguously demonstrated, and remain within the mechanism of sexual attractiveness within a single species which we clearly show is mediated by the female MB-alkane fraction governed by the fatty acid synthase genes. We however still consider potential alternative explanations (e.g., n-alkenes acting as a deterrent of homosexual mating attempts).

      The authors explore the function of cuticular hydrocarbons (CHCs) and a fatty-acyl synthase in Nasonia vitripennis, a parasitic wasp. Using RNAi, they successfully knockdown the expression of the fas5 gene in wasps. The authors do not justify their choice of fatty-acyl synthase candidate gene. It would have been interesting to know if that is one of many genes they studied or if there was some evidence that drove them to focus their interest in fas5.

      In a previous study, 5 fas candidate genes orthologous to Drosophila melanogaster fas genes were identified and mapped in the genome of Nasonia vitripennis (Buellesbach et al. 2022). We actually investigated the effects of all of these fas genes on CHC variation, but only fas5 led to such a striking, traceable pattern shift. We are currently preparing another manuscript discussing the effects of the other fas genes, but decided to focus exclusively on fas5 here, due to its significance for revealing how sexual attractiveness can be encoded and conveyed in complex chemical profiles, maintained and governed by a surprisingly simple genetic basis.

      The authors observe large changes in the cuticular hydrocarbon (CHC) profiles of males and females. These changes are mostly a reduction of some MB alkanes and an increase in others, as well as an increase of n-alkenes in fas5 knockdown females. For male fas5 knockdowns, the overall quantity of CHCs is increased and, consequently, multiple types of compounds are increased compared to wild-type, with only one compound appearing to decrease compared to wild-type. Insects are known to rely on ratios of compounds in blends to recognize odors. The authors address this by showing a plot of the relative ratios, but it seems to me that they do not show statistical tests of those changes in the proportions of the different types of compounds. In the results section, the authors give percentages while referring to figures showing the absolute amount of CHCs. They should also test if the ratios are significantly different or not between experimental conditions. Similar data should be displayed for the males as well.

      We appreciate your suggestions. We kindly refer you to our response to reviewer 1, where we addressed the statistical tests. Specifically, we generated separate subplots to display the proportions of different compound classes and performed statistical tests to compare these proportions between different treatments for both males and females. Additionally, we have revised the results section to replace relative abundances with absolute quantity, as depicted in Figure 2C-G.

      Furthermore, the authors didn't use an internal standard to measure the quantity of CHCs in the extracts, which, to me, is the gold standard in the field. If I understood correctly, the authors check the abundance measured for known quantities of n-alkanes. I'm sure this method is fine, but I would have liked to be reassured that the quantities measured through this method are good by either testing some samples with an internal standard, or referring to work that demonstrates that this method is always accurate to assess the quantities of CHC in extracts of known volumes.

      We actually did include 7.5 ng/μl dodecane (C12) as an “internal” standard in the hexane resuspensions of all of our processed samples (line 456, Materials and Methods). This was primarily done to allow for visually inspecting and comparing the congruence of all chromatograms in the subsequent data analysis and to immediately detect any variation arising from sample preparation, the injection process or instrument fluctuation. In our study, we use a very elaborate and standardized CHC extraction method in which the volume of solvent and the duration of extraction are strictly controlled to minimize variation from the sample preparation steps. Furthermore, we calibrated each individual CHC compound quantity with a dilution series of external standards (C21-C40) of known concentration. By constructing a calibration curve based on this dilution series, we achieved the most accurate compound quantification, also taking into account and counteracting the generally diminishing quantities of compounds with higher chain lengths.
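
      As a rough illustration of external-standard calibration, the sketch below fits a straight line through a dilution series of one standard and inverts it to quantify an unknown peak. It assumes a linear detector response. The concentrations, peak areas and function names are hypothetical and are not taken from the study, and the authors' actual calibration model may differ.

```python
# Hypothetical calibration sketch for a single n-alkane standard.
# Assumes a linear detector response: area = slope * concentration + intercept.

def fit_line(x, y):
    """Simple least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Dilution series of the external standard (assumed values, ng/ul)
standard_conc = [1.0, 2.5, 5.0, 10.0, 20.0]
# Made-up peak areas that follow an exactly linear response
standard_area = [1200.0 * c + 150.0 for c in standard_conc]

slope, intercept = fit_line(standard_conc, standard_area)


def quantify(peak_area):
    """Convert a sample peak area into a concentration via the fitted curve."""
    return (peak_area - intercept) / slope


sample_conc = quantify(9150.0)  # area of an unknown peak (hypothetical)
print(f"estimated concentration: {sample_conc:.2f} ng/ul")
```

      Fitting one such curve per chain length is one way to account for the weaker response of longer-chain compounds that the response mentions.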

      The authors provide a sensible control for their RNAi experiments: targeting an unrelated gene, absent in N. vitripennis (the GFP). This allows us to see if the injection of RNAi might affect CHC profiles, which it appears to do in some cases in males, but not in females. The authors also show to the reader that their RNAi experiments do reduce the expression of the target gene. However, one of the caveats of their experiments, is that the authors don't provide evidence or information to allow the (non-expert) reader to assess whether the fas5 RNAi experiments did affect the expression of other fatty-acyl synthase genes. I'm not an expert in RNAi, so maybe this suggestion is not relevant, but it should, at least, be addressed somewhere in the manuscript that such off-target effects are very unlikely or impossible, in that case, or more generally.

      We acknowledge the reviewer’s concern about potential off-target effects of the fas5 knockdown. We actually did check initially for off-target effects on the other four previously published fas genes in N. vitripennis (Lammers et al. 2019; Buellesbach et al. 2022) and did not find any effects on their respective expression levels. We now include these results as supplementary data (Figure 2-figure supplement 1). However, as mentioned in the cover letter to the editor, we discovered a previously uncharacterized fas gene in the most recent N. vitripennis genome assembly (NC_045761.1), fas6, most likely constituting a tandem gene duplication of fas5. These two genes turned out to have such high sequence similarity (> 90 %, Figure 2-figure supplement 2) that both were simultaneously downregulated by our fas5 dsRNAi construct, which we confirmed with qPCR and have now incorporated into our manuscript (Fig. 2H). Therefore, we now explicitly mention that the knockdown affects both genes, and that either one or both could have the observed phenotypic effects. Recognizing this RNAi off-target effect, we have now also incorporated a discussion of this issue in the appropriate section of the manuscript (lines 364-377), as well as of the potential off-target effects of our GFP dsRNAi controls (lines 262-274).

      The authors observe that the modified CHCs profiles of RNAi females reduce courtship and copulation attempts, but not antennation, by males toward live and (dead) dummy females. They show that the MB alkanes of the CHC profile are sufficient to elicit sexual behaviors from males towards dummy females and that the same fraction from extracts of fas5 knockdown females does so significantly less. From the previous data, it seems that dummy females with fas5 female's MB alkanes profile elicit more antennation than CHC-cleared dummy females, but the authors do not display data for this type of target on the figure for MB alkane behavioral experiments.

      Actually, similar proportions of males performed antennation behavior towards female dummies with the MB alkane fraction of fas5 RNAi females and towards CHC-cleared female dummies (55% and 50%, respectively; see Author response image 1 for the corresponding parts of sub-figures 3E and 4D). We did not deem it necessary to show the same data on CHC-cleared female dummies in Figure 3 as well.

      Author response image 1.

      Unfortunately, the authors don't present experiments testing the effect of the non-MB alkanes fractions of the CHC extracts on male behavior toward females. As such, they are not able to (and didn't) conclude that the MB-alkane is necessary to trigger the sexual behaviors of males. I believe testing this would have significantly enhanced the significance of this work. I would also have found it interesting for the authors to comment on whether they observe aggressive behavior of males towards females (live or dead) and/or whether such behavior is expected or not in inter-individual interactions in parasitoids wasps.

      In our experiment, we focus on the function of the MB-alkane fraction in female CHC profiles, and we comprehensively demonstrate in Figure 4 that the MB-alkane fraction from WT females alone is sufficient to trigger mating behavior consistent with that toward live females and untreated female dummies. Therefore, we do not completely understand the reviewer’s concern about us not being “able to (and didn't) conclude that the MB-alkane is necessary to trigger the sexual behaviors of males”. We appreciate the reviewer’s suggestion to test the non-MB alkanes (n-alkanes and n-alkenes). However, due to the experimental procedure of separating the CHC compound class fractions through elution with molecular sieves, it was not possible for us to retrieve either the whole n-alkane or n-alkene fraction (which remains bound to the sieves after separation). The role of n-alkenes in N. vitripennis is, however, considered in the discussion, as a deterrent for homosexual interactions between males (Wang et al. 2022a). Moreover, we did not observe aggressive behavior of males towards live or dead females.

      CHCs are used by insects to signal and/or recognize various traits of targets of interest, including species or groups of origin, fertility, etc. The authors claim that their experiments show the sexual attractiveness of females can be encoded in the specific ratio of MB alkanes. While I understand how they come to this conclusion, I am somewhat concerned. The authors very quickly discuss their results in light of the literature about the role of CHCs (and notably MB alkanes) in various recognition behaviors in Hymenoptera, including conspecific recognition. Previous work (cited by the authors) has shown that males recognize males from females using an alkene (Z9C31). As such, it remains possible that the "sexual attractiveness" of N. vitripennis females for males relies on them not being males and being from the right species as well. The authors do not address the question of whether the CHCs (and the MB alkanes in particular) of females signal their sex or their species. While I acknowledge that responding to this question is beyond the scope of this work, I also strongly believe that it should be discussed in the manuscript. Otherwise, non-specialist readers would not be able to understand what I believe is one of the points that could temper the conclusions from this work.

      We acknowledge the reviewer’s insight about the MB alkanes in signaling sex or species in N. vitripennis, and now include this aspect in our revised discussion (line 324-330). Moreover, we clearly demonstrate that n-alkenes have been reduced to minute trace components after our compound class separation, and the males still do not display courtship and copulation behaviors similar to WT females, thus strongly indicating that the n-alkenes do not play a role when relying solely on the changed MB-alkane patterns, further strengthening our main argument.

      References

      Benjamini, Y. and D. Yekutieli. 2001. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29:1165-1188.

      Buellesbach, J., J. Gadau, L. W. Beukeboom, F. Echinger, R. Raychoudhury, J. H. Werren, and T. Schmitt. 2013. Cuticular hydrocarbon divergence in the jewel wasp Nasonia: Evolutionary shifts in chemical communication channels? J. Evol. Biol. 26:2467-2478.

      Buellesbach, J., C. Greim, and T. Schmitt. 2014. Asymmetric interspecific mating behavior reflects incomplete prezygotic isolation in the jewel wasp genus Nasonia. Ethology 120:834-843.

      Buellesbach, J., H. Holze, L. Schrader, J. Liebig, T. Schmitt, J. Gadau, and O. Niehuis. 2022. Genetic and genomic architecture of species-specific cuticular hydrocarbon variation in parasitoid wasps. Proc. R. Soc. B 289:20220336.

      Engl, T., N. Eberl, C. Gorse, T. Krüger, T. H. P. Schmidt, R. Plarre, C. Adler, and M. Kaltenpoth. 2018. Ancient symbiosis confers desiccation resistance to stored grain pest beetles. Mol. Ecol. 27:2095-2108.

      Ferveur, J. F., J. Cortot, K. Rihani, M. Cobb, and C. Everaerts. 2018. Desiccation resistance: effect of cuticular hydrocarbons and water content in Drosophila melanogaster adults. PeerJ 6.

      Lammers, M., K. Kraaijeveld, J. Mariën, and J. Ellers. 2019. Gene expression changes associated with the evolutionary loss of a metabolic trait: lack of lipogenesis in parasitoids. BMC Genom. 20:309.

      Mair, M. M., V. Kmezic, S. Huber, B. A. Pannebakker, and J. Ruther. 2017. The chemical basis of mate recognition in two parasitoid wasp species of the genus Nasonia. Entomol. Exp. Appl. 164:1-15.

      Wang, Y., W. Sun, S. Fleischmann, J. G. Millar, J. Ruther, and E. C. Verhulst. 2022a. Silencing Doublesex expression triggers three-level pheromonal feminization in Nasonia vitripennis males. Proc. R. Soc. B 289:20212002.

      Wang, Z., J. P. Receveur, J. Pu, H. Cong, C. Richards, M. Liang, and H. Chung. 2022b. Desiccation resistance differences in Drosophila species can be largely explained by variations in cuticular hydrocarbons. eLife 11:e80859.

    1. Author Response

      Reviewer #1 (Public Review):

      The work described herein would have an impact on the field in multiple ways. Firstly, it demonstrates a novel metabolic role for MSH in the regulation of hepatic cholesterol metabolism. This may prove to be a viable therapeutic strategy for the treatment of dyslipidemia. Furthermore, the authors demonstrate an alternative signaling cascade elicited by MSH independent of cAMP, but rather relying on AMPK. This novel interaction between AMPK and MC1R could have more widespread implications beyond the control of hepatic cholesterol metabolism.

      For the most part, the conclusions offered by the authors are supported by the data that is presented. There are, however, a number of concerns in the current version of this manuscript detailed below.

      We thank the reviewer for the encouraging and insightful comments, and we are pleased to read that the manuscript has raised considerable interest.

      1) The authors demonstrate the expression of MC1R in hepatocytes through IHC staining and western blot analysis. Furthermore, the authors show an alteration in systemic bile acid homeostasis in MC1R KO mice. However, no mention of MC1R expression or function in cholangiocytes is discussed. This is important to assess both experimentally and within the discussion given the profound role of the biliary epithelium in modulating bile acid homeostasis. Furthermore, in figure 1 the authors validate the MC1R knockdown only through mRNA expression. Given panels A and C of figure 1 shows there is clearly a functional antibody for MC1R, validation of protein knockdown is needed.

      The reviewer raises an important point, which we addressed by performing immunofluorescence staining using an antibody against the cholangiocyte marker cytokeratin 19 (CK-19). These colocalization studies demonstrate the presence of MC1-R in CK19-positive cholangiocytes (Figure 1-figure supplement 1). Furthermore, we have now added a discussion on the possible role of MC1-R in modulating bile acid homeostasis in cholangiocytes (page 12, lines 456-462).
      We also quantified MC1-R protein expression by Western blotting in the liver of L-Mc1r-/- mice. MC1-R protein level was significantly reduced in L-Mc1r-/- mice compared to L-Mc1+/- mice (Figure 2-figure supplement 2).

      2) Figure 2 demonstrates a steatotic effect of MC1R knockdown in hepatocytes. The authors attempt to provide mechanistic insight into this phenomenon through assessing the mRNA expression of genes involved in cholesterol and fatty acid synthesis. The data provided is modest at the gene level and no protein validation was provided to demonstrate functional alterations of these proteins in MC1R KO mice. Key proteins proposed such as SREBP2 and HMGCR need to be validated via western blot or IHC analysis.

      As requested by the reviewer, we quantified the expression of key proteins in the liver of L-Mc1r-/- mice by Western blotting. We observed that the protein levels of HMGCR and DHCR7 as well as the ratio between the mature and precursor forms of SREBP2 were reduced in L-Mc1r-/- mice (Figure 2F-H, page 6/lines 182-191 & page 10-11/lines 390-401). This is likely a result of the feedback regulation, whereby cholesterol accumulation suppresses the cleavage of SREBP2 and leads to a consequent downregulation of the key cholesterol synthesis enzymes such as HMGCR and DHCR7 (Brown S & Goldstein JL, Cell. 1997 May 2;89(3):331-40).

      We discussed in the original submission (page 11) as follows: ‘In the presence of excess cellular cholesterol, transcriptional induction and posttranslational activation of SREBP-2 should be attenuated, which in turn downregulates Hmgcr and Dhcr7 and reduces cholesterol synthesis as a counterregulatory mechanism. Therefore, given the increase in hepatic cholesterol content, it was unexpected that Srebp2 expression was upregulated in the liver of L-Mc1r-/- mice’. The finding of reduced SREBP2/HMGCR protein expression is thus more logical, but admittedly, it is discordant with increased Srebp2/Hmgcr mRNA expression (as reported in the original submission), which might be a compensatory response to suppressed SREBP2 cleavage. Taking into account that activation of MC1-R did not affect the protein expression of HMGCR or DHCR7 in HepG2 cells, it is plausible that hepatic cholesterol accumulation in L-Mc1r-/- mice is driven by a defect in bile acid metabolism, rather than by a direct effect of MC1-R signaling on cholesterol synthesis. To avoid unnecessary confusion, we decided to omit the qPCR data and related text parts from the manuscript and report the protein expression data instead.

      4) The authors suggest the involvement of AMPK in mediating the cholesterol-lowering effects of MSH. However, MSH is still able to lower free cholesterol levels even in the presence of an AMPK inhibitor. This suggests that MSH does not in fact rely on the activation of AMPK to elicit these cholesterol-lowering effects. The authors' conclusions are stronger than the actual data support. Furthermore, the authors claim LD211 phenocopies the effects of MSH in the presence of an AMPK inhibitor. However, the authors only measured the phosphorylation of Akt as their outcome. This begs the question, does LD211 still lower total cholesterol in the presence of AMPK inhibitors? This experiment is essential to conclude whether or not LD211 phenocopies the effects of MSH.

      The reviewer may have missed that we postulate in the manuscript that ‘MC1-R activation engages multiple signaling mechanisms to regulate cholesterol metabolism in HepG2 cells’ (manuscript page 8, lines 310-311 & page 13, lines 498-508), since a low concentration of α-MSH was still able to lower the free cholesterol level in the presence of the AMPK inhibitor dorsomorphin. We have been careful not to claim that the effects of α-MSH are solely dependent on AMPK phosphorylation. Likewise, we have not claimed in the original submission that LD211 phenocopies the effects of MSH in the presence of an AMPK inhibitor. However, as suggested by the reviewer, we performed new experiments to investigate the effects of LD211 on cellular cholesterol levels in the absence and presence of dorsomorphin. We found that AMPK inhibition with dorsomorphin completely abolished the cholesterol-lowering effect of LD211 (Figure 7-figure supplement 2), which might indicate that this synthetic agonist has a stronger signaling bias toward the AMPK pathway compared to α-MSH.

      5) The authors initiate the project by showing high-fat diet disrupts the expression of MC1R. However, all of the subsequent experiments in hepatic MC1R KO mice are performed under normal chow. This begs the question of what is the phenotype of the hepatic MC1R KO mice fed a high-fat diet. Does KO of MC1R in the liver exacerbate HFD-induced obesity, glucose intolerance, and dyslipidemia? Inversely, can WT mice challenged with an HFD be rescued metabolically by treatment with either MSH or LD211? Providing data along these lines of investigation will provide physiological/clinical relevance to their findings.

      As suggested by the reviewer, we phenotyped the hepatic MC1R KO (L-Mc1r-/-) mice after feeding them a cholesterol- and fat-rich Western diet for 12 weeks (RD Western Diet, D12079B, Research Diets Inc, NJ, USA). This was exactly the same dietary regimen (product and duration) that was used to study the changes in hepatic MC1-R expression in wild-type C57Bl mice (Figure 1B&C). We observed that 12-week Western diet feeding induced a significant gain in body weight and total fat mass as well as an increase in plasma and hepatic cholesterol and TG levels (Figure 2-figure supplement 2). L-Mc1r-/- mice did not show a difference in body weight gain, but the weight gain was attributable to enhanced gain in fat mass and a blunted increase in lean mass compared to control Mc1rfl/fl mice (Figure 2-figure supplement 2A, D & E). Furthermore, liver weight and plasma cholesterol and TG concentrations were unchanged in HFD-fed L-Mc1r-/- mice (Figure 2-figure supplement 2B, C, F & G). Importantly, recapitulating the phenotype observed in chow-fed mice, hepatic cholesterol and TG content was significantly increased in L-Mc1r-/- mice after an HFD challenge (Figure 2-figure supplement 2H & I). Taken together, it appears that the phenotype of HFD-fed L-Mc1r-/- mice was slightly diluted compared to the phenotype observed in chow-fed L-Mc1r-/- mice. This phenotypic difference might relate to the finding that Western diet feeding reduced the hepatic expression of MC1-R, thus limiting the incremental effect of genetically induced MC1-R deficiency on hypercholesterolemia and hepatic lipid accumulation.

      We have previously studied the effects of pharmacological MC1-R activation in Western diet-fed mice and observed that chronic treatment with a selective MC1-R agonist reduced plasma cholesterol level and upregulated hepatic Ldlr expression without affecting body weight gain (Rinne P et al, Circulation. 2017 Jul 4;136(1):83-97.). These findings are also discussed on manuscript page 12, lines 475-478. Although the selective MC1-R agonist was different in that particular study, it is expected that LD211 would also elicit a similar cholesterol-lowering effect in Western diet-fed mice. Chronic treatment with α-MSH, on the other hand, would likely produce wide-ranging metabolic effects. In addition to MC1-R activation in hepatocytes and its consequent effect on liver cholesterol metabolism, α-MSH would affect feeding, energy expenditure and cholesterol metabolism via MC4-R activation in the central nervous system as well as fatty acid and glucose metabolism via MC5-R activation in the skeletal muscle. Therefore, the phenotype associated with α-MSH treatment would be complex and mediated by multiple mechanisms and MC-R subtypes, thus making it difficult to interpret the exact contribution of hepatic MC1-R signaling to the observed phenotype.

      Reviewer #2 (Public Review):

      Keshav Thapa et al. investigated the role of melanocortin 1 receptor (MC1-R) in cholesterol and bile acid metabolism in the liver. First, they observed that MC1-R is present in the mouse liver and that its expression is reduced in response to a cholesterol-rich diet. To determine the role of MC1-R in the liver, they generated hepatocyte-specific MC1-R KO mice (L-Mc1r-/-). These animals exhibited a significant increase in liver weight, lipid accumulation, triglycerides and cholesterol levels, and fibrosis in comparison with control mice. By performing liquid chromatography-mass spectrometry, the authors also found that L-Mc1r-/- mice have fewer bile acids in the plasma and faeces, but not in the liver. In accordance with these findings, mRNA/protein expression of different genes involved in these processes was altered in L-Mc1r-/- animals.

      Secondly, in an attempt to evaluate the underlying mechanisms, they measured the expression of MC1-R in HepG2 cells under different treatments (i.e., palmitic acid, LDL, and atorvastatin). Moreover, they stimulated these cells with the endogenous MC1-R agonist α-MSH, showing that this molecule decreases the free cholesterol content while increasing LDL and HDL uptake, and recapitulates some previously observed phenotypes in the proportions of bile acids. These effects were also encountered when using a selective agonist for MC1-R (i.e., LD211), further supporting the specific role of MC1-R. Finally, some experiments indicated that α-MSH evokes not a single but multiple intracellular signalling cascades through which the effects of MC1-R activation might take place.

      Overall, this work provides novel and interesting findings on the role of MC1-R in cholesterol and bile acid metabolism in the liver, which undoubtedly will have some crucial implications for future research. Nevertheless, some experimental details should be better explained for the correct interpretation of the data. Besides, discrepant results exist regarding the molecular mechanisms behind MC1-R action, and these require additional experimentation to support the conclusions drawn.

      We thank the reviewer for the encouraging and insightful comments, and we are pleased to read that the manuscript has raised considerable interest.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors aim to understand the role of clonal heterogeneity of tumors in immunogenicity of clonally expressed antigens. This is a significant problem with many basic as well as translational implications.

      The strength of the manuscript lies in the novel demonstration that a poorly immunogenic tumor antigen, when paired with a stronger tumor antigen, begins to elicit significant immune response. The weakness lies in the fact that the actual mechanism of the key demonstration is never shown. There is a lot of speculation and tangential experimentation, but little actual evidence of a mechanism.

      By making the key observation (mentioned in the strength section in the previous paragraph), the authors did achieve their objective albeit very partially. Their observation is based on excellent experimental tools and design. This study will stimulate further experiments in this important field.

      Their key observation is somewhat reminiscent of the practice of conjugating small "non-immunogenic" antigens (such as some carbohydrates) to large protein carriers (such as serum albumin) in order to elicit strong antibody response to the weaker antigen. It is interesting to contemplate if the underlying mechanisms have any commonality.

      We thank the reviewer for their consideration of our work and their constructive feedback. We concur that our study has limitations and further work will be necessary to fully deconstruct the mechanism leading to the observed phenotype. We have revised the text to better reflect the aim and scope of our study. However, the goal of our work was to establish a trackable model that would allow us to model different, albeit limited, degrees of antigen expression patterns reflecting what is observed in patients with different levels of ITH. Our key observation reproduces what is observed clinically, adding strength to the model. Next, we wanted to study what was different about the induced immune responses to develop strategies to better treat tumors with heterogeneous NeoAg expression patterns that currently do not respond to checkpoint blockade therapy. Studying KP-HetHigh and KP-HetLow tumors revealed that tumor debris-carrying cDC1 draining from KP-HetLow tumors phagocytosed both NeoAgs. This population of cDC1, carrying both NeoAgs, had a more stimulatory phenotype compared to cDC1 without tumor debris or cDC1 that had engulfed only one NeoAg. We were able to develop a targeted therapy including CD40 agonism based on our key observations: KP-HetLow had a more robust response towards the weaker NeoAg which was associated with more stimulatory cDC1 presenting both NeoAgs compared to KP-HetHigh tumors. The stronger immune response increased responsiveness to CBT.

      The reviewer makes an interesting point about conjugate vaccines, which canonically elicit greater responses because they engage multiple immune cells, namely T cells with B cells, resulting in stronger antibody responses. The prevalence of tumor debris-carrying cDC1 with both neoantigens in KP-HetLow does make us consider that this population of cDC1 may be engaging multiple immune populations, i.e., different neoantigen-specific T cells. We suggest this as a possible mechanism for greater Aatf responses, but further work is necessary to determine if the same cDC1 can directly interact with both neoantigen-specific T cells.

      Reviewer #2 (Public Review):

      There are data to suggest that intratumour mutational heterogeneity (ITH; the proportion of all mutations that are found only within cancer subclones) is associated with worse therapeutic outcomes. Specifically, patients with more mutations (and thus neoantigens) mostly expressed by subclones (high ITH) have poorer responses to checkpoint immunotherapy. The authors set out to explore the mechanisms underlying this by studying 2 dimensions of neoantigen biology: firstly, distribution (clonal vs subclonal) and secondly, immunogenicity (weak vs strong binding to MHC class I). Using a panel of lung cancer cell lines modified to express individual or dual neoantigens in order to model clonal and subclonal expression, elegant studies show that clonal co-expression with a "strong" neoantigen can boost the immunogenicity of a "weak" neoantigen and result in tumour control. Mechanistically, this is related to engulfment of both neoantigens by cross presenting type 1 conventional dendritic cells and the associated enhanced activation state of this cell type. This is an interesting and potentially important finding that may be related to mechanisms of epitope spreading as immune responses diverge from targeting more to less immunogenic epitopes. Overall, the study is thought-provoking, informative in relation to how neoantigen immunogenicity is shaped and may have practical relevance.

      We greatly appreciate the reviewer's constructive and insightful comments and questions on our work. We have edited the text in response to their feedback. We believe these changes have made the writing clearer and that it now communicates the scope of our study and our results to the reader more effectively.

    1. Reviewer #1 (Public Review):

      Zhou et al. have set up a study to examine how metabolism is regulated across the organism by taking a combined approach looking at gene expression in multiple tissues, as well as analysis of the blood. Specifically, they have created a tool for easily analyzing data from GTEx across 18 tissues in 310 people. In principle, this approach should be expandable to any dataset where multiple tissues of data were collected from the same individuals. While not necessary, it would also raise my interest to see the "Mouse(coming soon)" selection functional, given that the authors have good access to multi-tissue transcriptomics done in similarly large mouse cohorts.

      Summary:

      The authors have assembled a web tool that helps analyze multiple tissues' datasets together, with the aim of identifying how metabolic pathways and gene regulation are connected across tissues. This makes sense conceptually and the web tool is easy to use and runs reasonably quickly, considering the size of the data. I like the tool and I think the approach is necessary and surprisingly under-served; there is a lot of focus on multi-omics recently, but much less on doing a good job of integrating multi-tissue datasets even within a single omics layer.

      What I am less convinced about is the "Research Article" aspect of this paper. Studying circadian rhythm in GTEx data seems risky to me, given the huge range of circadian times at which the samples were collected. I also wonder (although this is not even remotely in my expertise) whether the circadian rhythm also gets rather desynchronized in people dying of natural causes, although I suppose this could be said for any gene expression pathway. Similarly, for the secreted proteins in Figure 4, looking at muscle-hippocampus transcript levels for ADAMTS17 doesn't make sense to me; of all the tissue pairs to build a vignette around to demonstrate the method, this is not an intuitive choice. The "within muscle" results look fine, but panels C, E, and G look like noise to me; panels C and G in particular are almost certainly noise, since those are pathways with gene counts of 2 and 1, respectively.

      I think this is an important effort and a good basis but a significant revision is necessary. This can devote more time and space to explaining the methodology and for ensuring that the results shown are actually significant. This could be done by checking a mix of negative controls (e.g. by shuffling gene labels and data) and a more comprehensive look at "positive" genes, so that it can be clearly shown that the genes shown in Fig 1 and 2 are not cherry-picked. For Figure 3, I suspect you would get almost an identical figure if instead of showing pan-tissue circadian clock correlations, you instead selected the electron transport chain, or the ribosome, or any other pathway that has genes that are expressed across all tissues. You show that colon and heart have relatively high connectivity to other tissues, but this may be common to other pathways as well.
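      The label-shuffling negative control suggested above can be sketched as a permutation test. This is a minimal illustration under stated assumptions, not the method used by the authors or by their web tool: it uses plain Pearson correlation rather than bicor, it shuffles the individual labels of one tissue to break the cross-tissue pairing, and the function names `pearson` and `permutation_p_value` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Empirical two-sided p-value for the correlation between x and y.

    x: expression of one gene in tissue A, one value per individual.
    y: expression of another gene in tissue B, same individuals, same order.
    Shuffling y breaks the pairing between individuals, which gives a
    null distribution of correlations expected by chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

      In practice the same shuffle would be applied to every gene pair shown in a figure, so that the reported correlations can be compared against a matched null rather than judged by eye.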

    2. Reviewer #2 (Public Review):

      Summary:

      Zhou et al. use publicly available GTEx data of 18 metabolic tissues from 310 individuals to explore gene expression correlation patterns within-tissue and across-tissues. They detect signatures of known metabolic signaling biology, such as ADIPOQ's role in fatty acid metabolism in adipose tissue. They also emphasize that their approach can help generate new hypotheses, such as the colon playing an important role in circadian clock maintenance. To aid researchers in querying their own genes of interest in metabolic tissues, they have developed an easy-to-use webtool (GD-CAT).

      This study makes reasonable conclusions from its data, and the webtool would be useful to researchers focused on metabolic signaling. However, some misconceptions need to be corrected, as well as greater clarification of the methodology used.

      Strengths:

      GTEx is a very powerful resource for many areas of biomedicine, and this study represents a valid use of gene co-expression network methodology. The authors do a good job of providing examples confirming known signaling biology as well as the potential to discover promising signatures of novel biology for follow-up and future studies. The webtool, GD-CAT, is easy to use and allows researchers with genes and tissues of interest to perform the same analyses in the same GTEx data.

      Weaknesses:

      A key weakness of the paper is that this study does not involve genetic correlations, the term used in the title and throughout the manuscript, but rather gene co-expression networks. The authors do mention the classic limitation that correlation does not imply causation, but this caveat is even more important given that these are not genetic correlations. Given that the goal of their study aligns closely with multi-tissue WGCNA, which is not a new idea (e.g., Talukdar et al. 2016; https://doi.org/10.1016/j.cels.2016.02.002), it is surprising that the authors only use WGCNA for its robust correlation estimation (bicor), but not its latent factor/module estimation, which could potentially capture cross-tissue signaling patterns. It is possible that the biological signals of interest would be drowned out by all the other variation in the data, but given that this is a conventional step in WGCNA, it is a weakness that the authors do not use it or discuss it.
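      For readers unfamiliar with the robust correlation mentioned above, the biweight midcorrelation can be sketched as follows. This is an illustrative stand-alone version of the standard definition, not the WGCNA code itself: the function names are hypothetical, and raising an error when the median absolute deviation is zero is a simplification chosen for this sketch.

```python
import math
import statistics


def _biweight_normalize(values):
    """Center on the median, down-weight outliers, and scale to unit norm."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # Simplification for this sketch: refuse degenerate input.
        raise ValueError("median absolute deviation is zero")
    weighted = []
    for v in values:
        centered = v - med
        u = centered / (9.0 * mad)
        # Tukey biweight: points with |u| >= 1 get zero weight.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted.append(centered * w)
    norm = math.sqrt(sum(t * t for t in weighted))
    return [t / norm for t in weighted]


def bicor(x, y):
    """Biweight midcorrelation of two equal-length numeric sequences."""
    a = _biweight_normalize(x)
    b = _biweight_normalize(y)
    return sum(p * q for p, q in zip(a, b))
```

      Because extreme values receive zero weight, a single outlying sample has far less influence on this estimate than on a Pearson correlation, which is the usual reason for preferring it in expression data.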

    3. Reviewer #3 (Public Review):

      Summary:

      A useful and potentially powerful analysis of gene expression correlations across major organ and tissue systems that exploits a subset of 310 humans from the GTEx collection (subjects for whom there are uniformly processed postmortem RNA-seq data for 18 tissues or organs). The analysis is complemented by a Shiny R application web service.

      The need for more multisystems analysis of transcript correlation is very well motivated by the authors. Their work should be contrasted with more simple comparisons of correlation structure within different organs and tissues, rather than actual correlations across organs and tissues.

      Strengths and Weaknesses:

      The strengths and limitations of this work trace back to the nature of the GTEx data set itself. The authors refer to the correlations of transcripts as "gene" and "genetic" correlations throughout. In fact, they name their web service "Genetically-Derived Correlations Across Tissues". But all GTEx subjects had strong exposure to unique environments and all correlations will be driven by developmental and environmental factors, age, sex differences, and shared and unshared pre- and postmortem technical artifacts. In fact we know that the heritability of transcript levels is generally low, often well under 25%, even in studies of animals with tight environmental control.

      This criticism does not materially detract from the importance and utility of the correlations (whether genetic, GXE, or purely environmental), but it does mean that the authors should ideally restructure and reword text so as to NOT claim so much for "genetics". It may be possible to incorporate estimates of chip heritability of transcripts into this work if the genetic component of correlations is regarded as critical (all GTEx cases have genotypes).

      Appraisal of Work on the Field:

      There are two parts to this paper: 1. "case studies" of cross-tissue/organ correlations and 2. the creation of an R/Shiny application to make this type of analysis much more practical for any biologist. Both parts of the work are of high potential value, but neither is fully developed. My own opinion is that the R/Shiny component is the more important immediate contribution and that the "case studies" could be placed in the context of a more complete primer. Alternatively, the case studies could be their own independent contributions with more validation.

    1. Reviewer #3 (Public Review):

      In this paper, the authors analyze a large previously published deep mutational scanning data set using a reference-free regression approach. They extract the contributions of single locus and epistatic effects to the functionality of the sequence (no, weak or strong transcription activation of two response elements). They find that pairwise epistasis plays a crucial and dominant role in creating functional sequences and in connecting the functional sequence space.

      I enjoyed reading the paper and the topic (role of epistasis in creating and connecting functional sequences; development of measures of epistasis) is very exciting to me. However, I found it difficult to judge the strength of the paper both because it is written in a rather dense and yet potentially redundant fashion (see comment 1) and because I was left with a number of questions upon reading. I will focus on conceptual questions in the following comments, since I am not able to judge the statistical approach in detail.

      1/ Regarding the biological result (importance of pairwise epistasis) I was wondering how potentially redundant the consecutive sections of the paper are. In which situation would the authors expect that pairwise epistasis does *not* play a crucial role for mutational steps, trajectories, or space connectedness, if it is dominant in the genotype-phenotype landscape? I would also appreciate an explanation of how many new biological results this paper delivers as compared with the paper in which the data were published (which I, unfortunately, cannot access at the moment of writing this report).

      2a/ Regarding the regression approach: I very much appreciate a reference-free approach to the estimation of epistasis. However, I would enjoy an explanation of how the results would have been (potentially) different if a reference-based approach was used, and how it compares with other reference-free approaches to estimating epistasis (e.g., linear regression or the gamma statistics of Ferretti et al. 2015).

      2b/ When comparing the outcomes with and without epistasis, I understood that the authors compare the estimated "full model" with the outcome if epistatic effects were ignored - but without a new estimation of main effects if epistasis is ignored. Wouldn't that be a more fair comparison?

      2c/ Where do the authors see the applicability of their approach to data beyond those analyzed in the present study? What are the requirements to use it? Does it only work for combinatorially complete landscapes? I did not have a chance to look at the code - how easily could other researchers apply the approach to their data?

    1. Reviewer #2 (Public Review):

      This study aims to describe a physical interaction between the kinase DYRK1A and the Tuberous Sclerosis Complex proteins (TSC1, TSC2, TBC1D7). Furthermore, this study aims to demonstrate that DYRK1A, upon interaction with the TSC proteins, regulates mTORC1 activity and cell size. Additionally, this study identifies T1462 on TSC2 as a phosphorylation target of DYRK1A. Finally, the authors demonstrate the role of DYRK1A on cell size using human, mouse, and Drosophila cells.

      This study, as it stands, requires further experimentation to support the conclusions on the role of DYRK1A on TSC interaction and subsequently on mTORC1 regulation. Weaknesses include: 1) the lack of an additional assessment of cell growth/size (e.g., protein content, proliferation), 2) the limited data on the requirement of DYRK1A for TSC complex stability and function, and 3) the limited perturbations on the mTORC1 pathway upon DYRK1A deletion/overexpression. Finally, this study would benefit from identifying under which nutrient conditions DYRK1A interacts with the TS complex to regulate mTORC1.

      The interaction described here is highly impactful to the field of mTORC1-regulated cell growth and uncovers a previously unrecognized TSC-associated interacting protein. Further characterization of the role that DYRK1A plays in regulating mTORC1 activation and the upstream signals that stimulate this interaction will be extremely important for multiple diseases that exhibit mTORC1 hyper-activation.

    1. Reviewer #3 (Public Review):

      This work provides a novel design of implantable and high-density EMG electrodes to study muscle physiology and neuromotor control at the level of individual motor units. Current methods of recording EMG using intramuscular fine-wire electrodes do not allow for isolation of motor units and are limited by the muscle size and the type of behavior used in the study. The authors of myomatrix arrays had set out to overcome these challenges in EMG recording and provided compelling evidence to support the usefulness of the new technology.

      Strengths:

      • They presented convincing examples of EMG recordings with high signal quality using this new technology from a wide array of animal species, muscles, and behavior.
      • The design included suture holes and pull-on tabs that facilitate implantation and ensure stable recordings over months.
      • Clear presentation of specifics of the fabrication and implantation, recording methods used, and data analysis

      Weaknesses:

      • The justification for the need to study the activity of isolated motor units is underdeveloped. The study could be strengthened by providing example recordings from studies that try to answer questions where isolation of motor unit activity is most critical. For example, there is immense value for understanding muscles with smaller innervation ratio which tend to have many motor neurons for fine control of eyes and hand muscles.

    1. Author Response:

      We would like to thank the Editors and Reviewers for their positive evaluations, constructive comments, and for the opportunity to revise our manuscript. We feel that the comments and suggestions will further improve our manuscript.

      In the updated manuscript we aim to incorporate all suggested changes and considerations provided by the Reviewers. In particular, we will provide further information on the quality-control ratings per subfield, as suggested by Reviewer 1. Moreover, we will evaluate whether the training-related changes were specific to CA1-3, rather than just showing significant alterations in CA1-3 and not in the other subfields. Last, as suggested by Reviewer 2, we will additionally test for multivariate associations between hippocampal subfield structure and function, to further evaluate the specificity of hippocampal subfield change as a function of training and cortisol.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      This study is well presented and contains all the necessary experiments to support their claims. They made the interesting finding of an additional factor Dyn2. However, it is unclear whether it is present in the human complex. Hence, it would be interesting to see whether Dyn2 co-purifies when expressed with the other complex components in insect cells. Also, purification of a tagged complex from yeast would have indicated whether Dyn2 is part of the complex and whether other factors, like RBM15 or Hakai, present in humans are also present in yeast.

      We agree that the Dyn2 subunit is an exciting new finding that is worth further investigation. The IP-MS experiments suggest that Dyn2 is a subunit of the complex and that the Dyn2 interaction is mediated via Slz1. We also noticed a reduction in m6A levels (50%) in the dyn2 deletion mutant. What the function of Dyn2 is and whether it is conserved remains to be determined.

      Our IP-MS experiments with Mum2 identified the complex as described in the manuscript; however, we did not find evidence of orthologs of RBM15 and Hakai. More follow-up work using in vivo and in vitro assays is needed to determine how m6A deposition by the yeast MTC is regulated.

      P3 top: Although m6A is the most abundant internal methylation variant, it is far below the methylation levels of cap-adjacent nucleotides in mammalian mRNAs (PMID: 35970556 ).

      We have added the word “internal” to the first sentence of the introduction.

      A list of author contributions is missing.

      We have added this in the revised version.

      Reviewer #2 (Recommendations For The Authors):

      Most of the conclusions of this paper are well supported by data, and the text is clearly written and easy to read. Here are my suggestions and comments:

      1) In Fig.2, why not use LC-MS to measure m6A levels in Ygl036w, Dyn2, Pab1, Npl3 mutants, as in Fig.1?

      For measuring m6A levels, we use a combination of LC-MS, m6A ELISA, and m6A-seq2 throughout the manuscript. We used ELISA in Fig. 2 because we had established this assay in the lab (Ensinck et al, RNA Journal, 2023). The m6A ELISA technique was more accessible and easier to execute than LC-MS. Additionally, our collaborator for the LC-MS moved his lab to another country, which made it impractical to continue using LC-MS.

      2) The protein purification experiment described in Fig. 4D is informative. Can they include Dyn2 in the expression system as well?

      Thank you for the suggestion. Dyn2 was not the focus of the manuscript as Dyn2 has, at best, only a minor role in m6A deposition in vivo. We are also currently aiming to dissect how Dyn2 regulates m6A and the yeast MTC in follow-up work. Hence, we decided not to add more experiments on Dyn2 to the current manuscript.

      3) Among the MTC components identified in this study, Dyn2 is a new and interesting subunit. It was shown that in C. elegans Dlc1 is involved in stabilizing the m6A writer Mett10. I wonder if yeast has a homolog of C. elegans Mett10?

      As far as we know, no ortholog of Mett10 (METTL16 in mammals) has been identified in budding yeast.

      4) The authors have emphasized "the m6A dependent and independent functions"; however, this is only based on previous observations. Is it possible that the less severe phenotype associated with ime4 catalytic mutant is due to residual catalytic activity? I think the data presented in Fig. 5 tell us that Ime4 and other MTC subunits have no additional moonlighting function. It is not entirely clear to me what "the m6A-independent function" is.

      The conclusion that the yeast MTC has m6A-dependent and m6A-independent functions is based on previous observations and the current work. Agarwala et al. (2012, PLOS Genetics) showed that mum2 and ime4 deletion mutants have a more severe phenotype than the slz1 deletion mutant or the catalytically inactive mutant of Ime4. We confirmed these observations in the revised manuscript (see Figure S5A and S5B). In this work, we showed that kar4 and vir1 deletion mutants have a delay in the onset of meiosis comparable to that of mum2 and ime4 deletion mutants. Also, the MTC remains intact in the absence of Slz1, but falls apart in ime4Δ, mum2Δ, and vir1Δ mutants, or shows strongly reduced RNA binding (kar4 deletion mutant). Based on this, we conclude that an m6A-independent function of the MTC exists.

      We have included data demonstrating that the catalytically inactive mutant has no residual m6A and a milder meiotic phenotype compared to the ime4 deletion mutant (see Figure S5A and S5B).

      5) In Mum2-TEV-ProA IP (1B) and Kar4-TEV-ProA IP (S1A), Slz1 was not significantly enriched; however, in the repeated Mum2-TEV-ProA IP with/without RNAse (S1B, 4C), Slz1 was strongly enriched. Why are the Slz1 results so variable?

      This is an astute observation, for which we do not have a definitive answer. One possibility is that Slz1 is the only subunit that is induced during meiosis. It is possible that induction of Slz1 varied between the different IP-MS experiments, hence leading to variability in its association with the MTC complex.

      6) The last paragraph on page 11, "Collectively...", and the first paragraph on page 12, "Collectively...", seem redundant.

      We have removed the duplicated paragraph in the revised manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      MCM8 and MCM9 are paralogues of the eukaryotic MCM2-7 proteins. MCM2-7 form a heterohexameric complex to function as a replicative helicase while MCM8-9 form another hexameric helicase complex that may function in homologous recombination-mediated long-tract gene conversion and/or break-induced replication. MCM2-7 complex is loaded during the low Cdk period by ORC, CDC6, and Cdt1, when the origin DNA may intrude into the central channel via the MCM2-MCM5 entry "gate". In the S phase, MCM2-7 complex is activated as CMG helicase with the help of CDC45 and GINS complex. On the other hand, it still remains unclear how MCM8-9 complex is loaded onto DNA and then activated.

      In this study, the authors first investigated the cryo-EM structure of chicken MCM8-9 (gMCM8-9) complex. Based on the data obtained, they suggest that the observed gMCM8-9 structure might represent the structure of a loading state with possible DNA entry "gate". The authors further investigated the cryo-EM structure of human MCM8-9 (hMCM8-9) complex in the presence of the activator protein, HROB, and compared the structure with that obtained without HROB1, which the authors published previously. As a result, they suggest that MCM8-9 complex may change the conformation upon HROB binding, leading to helicase activation. Furthermore, based on the structural analyses, they identified some important residues and motifs in MCM8-9 complex, mutations of which actually impaired the MCM8-9 activity in vitro and in vivo.

      Overall, the data presented would support the authors' conclusions and would be of wide interest for those working in the fields of DNA replication and repair. One caveat is that most of the structural data are shown only as ribbon model without showing the density map data obtained by cryo-EM, which makes accurate evaluation of the data somewhat difficult.

      We thank the reviewer for the positive comments on our work. To allow evaluation of all the structural data, we have presented the density maps of the cryo-EM structures of the gMCM8/9 complex in supplementary figures S5 and S6 of the revised manuscript. In addition, the 3D cryo-EM maps of the gMCM8/9 complex and the hMCM8/9 NTD ring have been deposited in the EMDB under accession numbers EMD-32346 and EMD-33989, respectively. The corresponding atomic models have been deposited in the RCSB PDB under accession codes 7W7P and 7YOX, respectively. All these data were released in May 2023.

      Reviewer #2 (Public Review):

      MCM8 and MCM9 together form a hexameric DNA helicase that is involved in homologous recombination (HR) for repairing DNA double-strand breaks. The authors have previously reported on the winged-helix structure of the MCM8 (Zeng et al. BBRC, 2020) and the N-terminal structure of MCM8/9 hexameric complex (MCM8/9-NTD) (Li et al. Structure, 2021). This manuscript reports the structure of a near-complete MCM8/9 complex and the conformational change of MCM8/9-NTD in the presence of its binding protein, HROB, as well as the residues important for its helicase activity.

      The presented data might potentially explain how MCM8/9 works as a helicase. However, additional studies are required to conclude this point because the presented MCM8/9 structure is not a DNA-bound form and HROB is not visible in the presented structural data. Taking these points into account, this work will be of interest to biologists studying DNA transactions.

      A strength of this paper is that the authors revealed the near-complete MCM8/9 structure at 3.66 Å and 5.21 Å for the NTD and CTD, respectively (Figure 1). Additionally, the authors discovered a conformational change in the MCM8/9-NTD when HROB was included (Figure 4) and a flexible nature of MCM8/9-CTD (Figure S6 and Movie 1).

      The biochemical data that demonstrate the significance of the OB-hp motif and the N-C linker for DNA helicase activity require careful interpretation (Figures 5 and 6). To support the conclusion, the authors should show that the mutant proteins form the hexamer without problems. Otherwise, it is conceivable that the mutant proteins are flawed in complex formation. If that is the case, the authors cannot conclude that these motifs are vital for the helicase function.

      A weakness of this paper is that the authors have already reported the structure of MCM8/9-NTD utilizing human proteins (Li et al. Structure, 2021). Although they succeeded in revealing the high-resolution structure of MCM8/9-NTD with the chicken proteins in this study, the two structures are extremely comparable (Figure S2), and the interaction surfaces seem to be the same (Figure 2).

      Another weakness of this paper is that the presented data cannot fully elucidate the mechanistic insights into how MCM8/9 functions as a helicase for two reasons. 1) The presented structures solely depict DNA-unbound forms. It is critical to reveal the structure of a DNA-bound form. 2) The MCM8/9 activator, HROB, is not visible in the structural data. Even though HROB caused a conformational change in MCM8/9-NTD, it is critical to visualize the structure of an MCM8/9-HROB complex.

      We appreciate the reviewer’s comments on our work. Regarding the first weakness mentioned above, the previously reported cryo-EM structure of hMCM8/9 NTD ring was achieved with a resolution of 6.6 Å. At this level of resolution, we were only able to observe the overall shape of the structure and a partial representation of the protein's secondary structure. It is hard for us to discern any specific details regarding the interaction interface between MCM8 and MCM9. In this study, we solved the structure of gMCM8/9 NTD ring with a resolution of 3.67 Å. We believe that the higher resolution of gMCM8/9 NTD structure provides a significant advantage in analyzing the interaction surface between MCM8 and MCM9. This improved resolution has enabled us to gain valuable insights into the assembly mechanism of the MCM8/9 hexamer, representing a significant step forward in our understanding of the MCM8/9 helicase complex. In response to the second weakness raised by the reviewer, we fully agree with the reviewer that high-resolution structures of the MCM8/9 complex with DNA or HROB are necessary to elucidate the mechanism of this helicase complex. We are actively working towards obtaining these complex structures using cryo-EM and X-ray crystal diffraction.

      Moreover, we would like to address the reviewer's concern regarding the mutant proteins used in the in vitro helicase assays. We have conducted additional experiments to confirm that these mutations do not impair the formation of the MCM8/9 hexamer. Specifically, we performed size exclusion chromatography (SEC) analyses of the wild-type (WT) MCM8/9 complex, as well as the MCM8 and MCM9 mutant proteins (Author response image 1). The results demonstrated that all the proteins behaved consistently and displayed similar SEC profiles during the purification process. Notably, the N-C linker deletion mutant (hMCM8_Δ369-377+MCM9_Δ283-287), which combines the MCM8 and MCM9 N-C linker deletions, also behaved similarly to WT MCM8/9 (Author response image 2). These findings strongly suggest that the mutations in the OB-hp regions and the N-C linkers do not disrupt hexamer formation of the MCM8/9 complex. Author response image 1 and Author response image 2 have been included in supplementary figures S8 and S11, respectively.

      Author response image 1.

      SEC profiles of WT and OB-hps mutants of MCM8/9 complex.

      Author response image 2.

      SEC profiles of WT and N-C linker mutant of MCM8/9 complex.

      Reviewer #1 (Recommendations For The Authors):

      I would like to provide some suggestions to improve the manuscript.

      1) Throughout the manuscript, more density map data obtained by the cryo-EM should be shown for accurate evaluation of the data. For example, in Figure 1C, the authors state that the inner channel of the gMCM8-9 hexamer is ~28 angstrom, apparently based on the ribbon model. This is not appropriate because the space in the ribbon model is not the same as that in the density map. For Figure 1B, they state that "The domain structures of gMCM8-9 fit well into their electron map". If so, please show the actual docking data. Also for Figure 2, the docking presentation between the side chains in the ribbon model and the density map should be shown.

      We sincerely thank the reviewer for the constructive suggestions. In addition to releasing our structural data in the EMDB and PDB, we have followed the reviewer’s suggestions and included more density map data in the supplementary material. In fact, when calculating the diameter of the inner channel of the MCM8/9 hexamer, we also measured it on the density map (Author response image 3, A and B), and the result is consistent with the value reported in our manuscript. To further evaluate the structure of MCM8/9, we have included additional docking structures based on the density map (Author response image 3, C-F). Moreover, for Figure 2, more docking presentations are provided and the key residues involved in the hydrophobic interactions are highlighted in bold (Author response image 4). Author response image 3 and Author response image 4 have been included in supplementary figures S5 and S6, respectively.

      Author response image 3.

      The cryo-EM structure of gMCM8/9. (A and B) Reconstructed cryo-EM map of gMCM8/9. The diameter of the inner channel of MCM8/9 was measured at ~28 Å. (C-F) Representative regions of the cryo-EM structure of gMCM8/9 NTD are shown based on their density map. C, chain A (MCM9); D, chain B (MCM8); E, chain C (MCM9); F, chain D (MCM8).

      Author response image 4.

      Representative regions of the cryo-EM structure of gMCM8/9 NTD. (A and B) The regions mediating the hydrophobic interaction in Figure 2B. A (MCM8), B (MCM9). (C and D) The regions mediating the hydrophobic interaction in Figure 2C. C (MCM8), D (MCM9). The key residues are shown in bold.

      2) Figures 4, 5, and 6: For helicase assay, more detailed experimental conditions (e.g. concentrations of DNA substrates and proteins used) should be presented. In addition, it should be described how Flag-hMCM8-9 complex (Figure 4C) was purified.

      We sincerely appreciate the constructive suggestion provided by the reviewer. In the revised manuscript, we have included more experimental details for the helicase assays, including the concentrations of DNA substrates and proteins. The following paragraph describes the updated experimental procedure and is also provided in the revised version of the manuscript.

      Helicase assays: To prepare the substrate, the oligonucleotide (5'(dT)40GTTTTCCCAGTCACGACG-TTGTAAAACGACGGCCAGTGCC-3') containing a 40 nt region complementary to the M13mp18(+) strand and a 40 nt oligo-dT at the 5′ end was labeled at the 3′ terminus with [α-32P] dCTP (Perkin Elmer) and annealed to the single-stranded DNA M13mp18 (24). DNA substrate (0.1 nM in molecules) was mixed with 5 µg of recombinant MCM8/9 complex or its mutants, as indicated, in a 15 µl reaction in helicase buffer (25 mM HEPES, pH 7.5, 1 mM magnesium acetate, 25 mM sodium acetate, pH 5.2, 4 mM ATP, 0.1 mg/ml BSA, 1 mM DTT). 2.5 µg HROB was used as an activator. To avoid re-annealing, the reaction was supplemented with a 100-fold excess of unlabeled oligonucleotide. The reactions were then incubated at 37 °C for 60 min and stopped by adding 1 µl of stop buffer (0.4% SDS, 30 mM EDTA, and 6% glycerol) and 1 µl of proteinase K (20 mg/ml, Sigma), followed by a further 10 min incubation at 37 °C. The products were separated by 15% polyacrylamide gel electrophoresis in 1× TBE buffer and analyzed with an Amersham Typhoon (Cytiva).

      In addition, to describe the expression of the Flag-hMCM8/9 complex in Figure 4C, we have included the Pull-Down Assay in the “Material and Methods” section. The description is as follows: The HEK293T cells transfected with Flag-hMCM8/9-FL or Flag-hMCM8/9-NTD were cultured overnight and washed twice with cold phosphate-buffered saline (PBS). Cell pellets were resuspended in lysis buffer (20 mM Tris, pH 7.5, 150 mM NaCl, 5 mM EDTA, 0.5% NP-40, 10% glycerol, protease inhibitor cocktail (Roche, 04693132001)). After incubation for 45 min at 4 °C with gentle agitation, the whole-cell lysates were collected by centrifugation (12,000 × g for 15 min, at 4 °C). GST beads coupled with 2 μg GST-HROB or GST alone were then incubated with an equal volume of the above HEK293T cell lysates at 4 °C for 4 h. The beads were washed four times with lysis buffer. Proteins bound to the beads were separated by SDS–PAGE and subsequently immunoblotted with anti-Flag antibody (Cytiva).

      3) Figure 3C: This is just an assumed model. Please clearly state it in the manuscript.

      We appreciate the reviewer’s comment. We believe the reviewer is referring to Figure 5C: Figure 3C depicts the top view of the gMCM8/9 hexamer structurally aligned with the MCM2-7 double hexamer (wheat) by aligning their respective C-tier rings, whereas Figure 5C represents an assumed model in which we docked a forked DNA fragment into the central channel of the gMCM8/9 hexamer. To make clear that this is an assumed model, we have added the following clarification to the revised manuscript: “We artificially docked a forked DNA into the central channel to generate a gMCM8/9-DNA model and found that the OB-hps of gMCM8 are capable to closely contact with it and insert their highly positively charged terminal loops into the major or minor grooves of the DNA strand, implying that they could be involved in substrate DNA processing and/or unwinding (Figure 5C)”.

      4) Figure S1, C and D: The coloring of the gMCM8-9 CTD appears to show higher resolution than the NTD. Might this be a misrepresentation?

      We appreciate the reviewer's valuable feedback, and we have thoroughly re-evaluated Figure S1C and D. Initially, the local resolution distributions of the gMCM8/9 NTD and gMCM8/9 CTD were calculated using CryoSPARC. Upon re-examination, we found that the resolution of the gMCM8/9 CTD density map may be lower than 3.66 Å, because this map does not reveal more structural details than the gMCM8/9 NTD map. Thus, although the map shown in Figure S1D may appear to show a greater distribution of high-resolution regions, we would like to clarify that this discrepancy could be attributed to an optical illusion. We thank the reviewer for bringing this to our attention.

      5) Figure S9: Is the "mean resolution" 5.21 angstrom identical to the Gold standard FSC? If not, please estimate the resolution using FSC, like other maps in this paper.

      We thank the reviewer for the constructive suggestion. In response to this feedback, we would like to clarify the resolution estimation process for the gMCM8/9 CTD. Initially, we calculated the resolution of the gMCM8/9 CTD using the gold standard Fourier shell correlation (FSC) method, which yielded a resolution of 3.66 Å. However, upon further analysis, we identified an issue with the GSFSC Resolution curves, which led to an overestimation of the resolution based on the density map of the gMCM8/9 CTD. To ensure a more reliable and accurate estimation, we employed the Phenix software package to calculate the mean resolution during the refinement process of the gMCM8/9 CTD structure. The calculated mean resolution was determined to be 5.21 Å, which aligns more reasonably with the characteristics of the density map. To address any potential misunderstandings and provide clarity, we have explicitly labeled and described the evaluation process for this mean resolution in the "Single particle data processing" section of the Materials and Methods.

      Minor points:

      1) Throughout the manuscript, there are several typographical and grammatical errors, which should be corrected. For example, in "Introduction", "GNIS complex" should be "GINS complex".

      We thank the reviewer for pointing out the typographical and grammatical errors. We have corrected the grammar errors and polished our manuscript with the help of native speakers.

      Reviewer #2 (Recommendations For The Authors):

      1) "During HR repair, MCM8/9 was rapidly recruited to the DNA damage sites and colocalized with the recombinase Rad51 (21). It also interacted with the nuclease complex MRN (MRE11-RAD50-NBS1) and was required for DNA resection at DSBs to facilitate the HR repair (Introduction)."

      There is a debate about whether MCM8/9-HROB colocalizes with RAD51 and whether it works upstream or downstream of RAD51 (Park et al. MCB, 2013; Lee et al. Nat Commun., 2015; Lutzmann et al. Mol Cell, 2012; Nishimura et al. Mol Cell, 2012; Natsume et al. G&D, 2017; Hustedt et al. G&D, 2019; Huang et al. Nat Commun., 2020).

      We completely agree with the reviewer that previous studies have reported contradictory results regarding the function of MCM8/9 in homologous recombination. The structural information on MCM8/9 does not currently give us direct evidence to resolve the ongoing debate. Nonetheless, based on our findings, we speculate that the MCM8/9 complex is likely involved in multiple steps within the process of homologous recombination. The structural insights provided by our study serve as a foundation for further investigations and may contribute to a better understanding of the complex and multifaceted roles of MCM8/9 in homologous recombination repair.

      2) I noted that the BioRxiv version 1 (https://www.biorxiv.org/content/10.1101/2022.01.26.477944v1?versioned=true) contains a near-complete MCM8/9 with human protein based on the crystal analysis. Because its structure is comparable to chicken MCM8/9 revealed by cryo-EM, I highly suggest including this data in the manuscript.

      We would like to thank the reviewer for this suggestion. The resolution of the hMCM8/9 crystal structure presented in our previous BioRxiv version is 6.6 Å, which is a little low. Moreover, it cannot provide more information than the present cryo-EM structures of MCM8/9. We are dedicated to optimizing the crystal quality and implementing strategies to enhance the resolution of the structure. We hope to present an improved crystal structure of hMCM8/9 in our forthcoming article.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their insightful comments. The main issue raised by the reviewers was that, because E6AP depletion reduced checkpoint signaling via MASTL upregulation, this pathway is likely to be involved in DNA damage checkpoint activation as well as checkpoint recovery. Hence, the proposed “timer”-like model is not fully supported. However, it is important to note that the expression level of MASTL is not upregulated during the activation stage of the DNA damage checkpoint (unless E6AP is depleted). DNA damage signaling, via ATM-dependent E6AP phosphorylation, caused MASTL accumulation over time. This ultimately shifts the balance toward checkpoint recovery and cell cycle re-entry. As such, the role of MASTL (and E6AP depletion) in suppressing the DNA damage checkpoint is in harmony with the proposed role of MASTL upregulation in promoting checkpoint recovery. We have made additional clarifications about this point in the revised manuscript.

      We have also addressed other concerns raised by the reviewers, as explained in the point-by-point responses below. With the addition of new modifications and data, we believe the revised manuscript is complete and conclusive.

      Reviewer #1 (Public Review):

      In principle a very interesting story, in which the authors attempt to shed light on an intriguing and poorly understood phenomenon; the link between damage repair and cell cycle re-entry once a cell has suffered from DNA damage. The issue is highly relevant to our understanding of how genome stability is maintained or compromised when our genome is damaged. The authors present the intriguing conclusion that this is based on a timer, implying that the outcome of a damaging insult is somewhat of a lottery; if a cell can fix the damage within the allocated time provided by the "timer" it will maintain stability, if not then stability is compromised. If this conclusion can be supported by solid data, the paper would make a very important contribution to the field.

      However, the story in its present form suffers from a number of major gaps that will need to be addressed before we can conclude that MASTL is the "timer" that is proposed here. The primary concern being that altered MASTL regulation seems to be doing much more than simply acting as a timer in control of recovery after DNA damage. There is data presented to suggest that MASTL directly controls checkpoint activation, which is very different from acting as a timer. The authors conclude on page 8 "E6AP promoted DNA damage checkpoint signaling by counteracting MASTL", but in the abstract the conclusion is "E6AP depletion promoted cell cycle recovery from the DNA damage checkpoint, in a MASTL-dependent manner". These 2 conclusions are definitely not in alignment. Do E6AP/MASTL control checkpoint signaling or do they control recovery, which is it?

      Also, there is data presented that suggest that MASTL does more than just controlling mitotic entry after DNA damage, while the conclusions of the paper are entirely based on the assumption that MASTL merely acts as a driver of mitotic entry, with E6AP in control of its levels. This issue will need to be resolved.

      We thank the reviewer for his/her insightful comments. The main issue raised by the reviewers was that, because E6AP depletion reduced checkpoint signaling via MASTL upregulation, this pathway is likely to be involved in DNA damage checkpoint activation as well as checkpoint recovery. Hence, the proposed “timer”-like model is not fully supported. However, it is important to note that the expression level of MASTL is not upregulated during the activation stage of the DNA damage checkpoint (unless E6AP is depleted). DNA damage signaling, via ATM-dependent E6AP phosphorylation, caused MASTL accumulation over time. This ultimately shifts the balance toward checkpoint recovery and cell cycle re-entry. As such, the role of MASTL (and E6AP depletion) in suppressing the DNA damage checkpoint is in harmony with the proposed role of MASTL upregulation in promoting checkpoint recovery. We have made additional clarifications about this point in the revised manuscript.

      As suggested by the reviewer, we have rephrased the statement in the abstract to “E6AP depletion reduced DNA damage signaling, and promoted cell cycle recovery from the DNA damage checkpoint, in a MASTL-dependent manner”.

      As a mitotic kinase, MASTL promotes mitotic entry and progression. This is well in line with our findings that DNA damage-induced MASTL upregulation promotes cell cycle re-entry into mitosis. MASTL upregulation could also inhibit DNA damage signaling. This kind of inhibitory feedback modulation of DNA damage signaling by mitotic kinases (e.g., PLK1, CDK) has been implicated in previous studies (reviewed in Cell & Bioscience volume 3, Article number: 20 (2013)). In the revised manuscript, we have included more discussion of this aspect of checkpoint regulation.

      Finally, the authors have shown some very compelling data on the phosphorylation of E6AP by ATM/ATR, and its role in the DNA damage response. But the time resolution of these effects in relation to arrest and recovery has not been addressed.

      Detailed time point information is now added in the figure legends for E6AP phosphorylation data. We were able to observe this event during early stages (e.g., 1 hr, or 2-4 hr) of the DNA damage response, prior to significant MASTL protein accumulation.

      Reviewer #2 (Public Review):

      This is an interesting study from Aimin Peng's laboratory that builds on previous work by the PI implicating Greatwall Kinase (the mammalian gene is called MASTL) in checkpoint recovery.

      The main claims of this study are:

      1) Greatwall stability is regulated by the E6-AP ubiquitin ligase and this is inhibited following DNA damage in an ATM dependent manner.

      2) Greatwall directly interacts with E6-AP and this interaction is suppressed by ATM dependent phosphorylation of E6-AP on S218

      3) E6-AP mediates Greatwall stability directly via ubiquitylation

      4) E6-AP knock out cells show reduced ATM/ATR activation and quicker checkpoint recovery following ETO and HU treatment

      5) Greatwall mediated checkpoint recovery via increased phosphorylation of Cdk substrates

      In my opinion, there are several interesting findings presented here but the overall model for a role of the E6-AP -Greatwall axis is not fully supported by the current data and will require further work. Moreover, there are a number of technical issues making it difficult to assess and interpret the presented data.

      Major points:

      1) The notion that Greatwall is indeed required for checkpoint recovery hinges on two experiments shown in Figures 5A and B where Greatwall depletion blocks the accumulation of HeLa cells in mitosis following recovery from ETO treatment and in G2/M following release from HU. An alternative possibility to the direct involvement of Greatwall in checkpoint recovery could be that Greatwall in HeLa cells is required for S-phase progression (as for example Charrasse et al. suggested). A simple control would be to monitor the accumulation of mitotic cells by microscopy or FACS following Greatwall depletion without any further checkpoint activation.

      We thank the reviewer for his/her insightful comments.

      Charrasse et al. showed that ENSA knockout prolonged, but did not stop, S-phase progression. In our experiments, partial MASTL knockdown did not significantly impact HeLa cell proliferation in the absence of DNA damage (Fig. 5, supplemental 1A). The reported role of MASTL in checkpoint recovery was consistently seen in response to various drugs, including etoposide, which typically induces G2 arrest. Thus, we do not believe a prolonged S-phase accounts for the checkpoint recovery phenotype.

      2) The changes in protein levels of Greatwall and the effects of E6-AP on Greatwall stability are rather subtle and depend mostly on a qualitative assessment of western blots. Where quantifications have been made (Figures 2D and 4F) the loading control and the starting conditions for Greatwall (0 timepoints in the right panel) appear saturated making precise quantification impossible. I would argue that the authors should at least quantify the immuno-blots that led them to conclude on changes in Greatwall levels and make sure that the exposure times used are in the dynamic range of the camera (or film). A more precise experiment would be to use the exogenously expressed CFP-Greatwall that is described in Figure 6 and measure the acute changes in protein levels using quantitative fluorescence microscopy in live cells. This is, in my opinion, a lot more trustworthy than quantitative immuno-blots.

      I also note here that most experiments linking Greatwall levels to E6-AP were done using siRNA, while the E6-AP ko cells would be a more reliable background for these experiments, especially with reconstituted controls.

      DNA damage-induced MASTL upregulation was observed in various cell lines and after different treatments. To further strengthen this point, as suggested by the reviewer, we have included quantification of fluorescent measurements (Fig. 2, supplemental 1 A-C). Quantification of immunoblots for MASTL upregulation was also added in Fig. 1, supplemental 1E. The effects of E6AP depletion were consistently shown for both siRNA and stable KO.

      3) This study has no data linking the effects of Greatwall to its canonical target PP2A:B55. The model shown in Figure 9 is therefore highly speculative. The possibility that Greatwall could act independently of PP2A:B55 should at least be considered in the discussion given the lack of experimental evidence.

      The role of MASTL in promoting cell cycle progression via suppressing PP2A/B55 has been well established. As suggested by the reviewer, we have included discussions to acknowledge that “The role of MASTL upregulation in promoting checkpoint recovery and cell cycle progression can be attributed to inhibition of PP2A/B55, although the potential involvement of additional mechanisms is not excluded”.

      4) The major effect of E6-AP depletion on the checkpoint appears to be a striking reduction in ATM/ATR activation, suggesting that this ubiquitin ligase is involved in checkpoint activation rather than recovery. It is not clear if this phenotype is dependent on Greatwall. If so it would be hard to reconcile with the default model that E6-AP acts via the destabilisation of Greatwall. In the permanent absence of E6-AP, increased Greatwall levels should inactivate B55:PP2A. How would this lead to a decrease in ATM/ATR activation? This is unlikely, and indeed Figure 5E shows that the reduction of MASTL in parallel to E6-AP does not result in elevated levels of ATR/ATM activation. Conversely, the S215A E6-AP mutant does have a strong rescue impact on ATR/ATM (Figure 8D).

      We do not propose that PP2A/B55 directly reverses ATM/ATR-mediated phosphorylation. Rather, PP2A/B55 dephosphorylates and inactivates mitotic kinases and substrates, which can exert feedback inhibition on DNA damage checkpoint signaling (as previously shown for PLK1 and CDK). We have included a discussion of this point in the revised manuscript.

      The point regarding checkpoint activation vs recovery is addressed below (point 5).

      5) In summary, I do not think that the presented experiments clearly dissect the involvement of E6-AP and Greatwall in checkpoint activation and recovery. E6-AP depletion has a strong effect on checkpoint activation while Greatwall depletion is likely to have various checkpoint-independent effects on cell cycle progression.

      It is important to note that the expression level of MASTL is not upregulated during the activation stage of the DNA damage checkpoint (unless E6AP is depleted). DNA damage signaling, via ATM-dependent E6AP phosphorylation, caused MASTL accumulation over time. This ultimately shifts the balance toward checkpoint recovery and cell cycle re-entry. As such, the role of MASTL (and E6AP depletion) in suppressing the DNA damage checkpoint is in harmony with the proposed role of MASTL upregulation in promoting checkpoint recovery. We have made additional clarifications about this point in the revised manuscript.

      Reviewer #3 (Public Review):

      In this manuscript, Li et al. describe the contribution of the ATM-E6AP-MASTL pathway in recovery from DNA damage. Different types of DNA damage trigger an increase in protein levels of mitotic kinase MASTL, also called Greatwall, caused by increased protein stability. The authors identify E3 ligase E6AP to regulate MASTL protein levels. Depletion or knockout of E6AP increases MASTL protein levels, whereas overexpression of E6AP leads to lower MASTL levels. E6AP and MASTL were suggested to interact in conditions without damage and this interaction is abrogated after DNA damage. E6AP was shown to be phosphorylated upon DNA damage on Ser218 and a phosphomimicking mutant does not interact with MASTL. Stabilization of MASTL was hypothesized to be important for recovery of the cell cycle/mitosis after DNA damage.

      The identification of this novel pathway involving ATM and E6AP in MASTL regulation in the DNA damage response is interesting. However, it is surprising that the authors state that not a lot is known about DNA damage recovery while Greatwall and MASTL have been described to be involved in DNA damage (checkpoint) recovery. In addition, PP2A, a phosphatase downstream of MASTL, is a known mediator of checkpoint recovery, in addition to other proteins like Plk1 and Claspin. Although some of the publications regarding these known mediators of DNA damage recovery are mentioned, the discussion regarding the relationship to the data in this manuscript is very limited.

      We thank the reviewer for his/her insightful comments. As suggested, the previously reported role of PLK1 and other cell cycle kinases in DNA damage checkpoint recovery is discussed in more detail in the revised manuscript. As for PP2A/B55, we do not think it promotes checkpoint recovery, e.g., by dephosphorylating ATM/ATR or their substrates. Instead, this phosphatase dephosphorylates cell cycle kinases or their substrates, such as CDK1 or PLK1.

      The regulation of MASTL stability by E6AP is novel, although the data regarding this regulation and the interaction are not entirely convincing. In addition, several experiments presented in this paper suggest that E6AP is (additionally) involved in checkpoint signalling/activation, whereas the activation of the G2 DNA damage checkpoint was described to be independent of MASTL. Does E6AP have multiple functions in the DNA damage response, or is ATM-E6AP-MASTL regulation not as straightforward as presented here?

      Altogether, in my opinion, not all conclusions of the manuscript are fully supported by the data.

      We showed that E6AP depletion reduced checkpoint signaling via MASTL upregulation, so this pathway is likely to be involved also in DNA damage checkpoint activation, in addition to checkpoint recovery. However, it is important to note that the expression level of MASTL is not upregulated during the activation stage of the DNA damage checkpoint (unless E6AP is depleted). DNA damage signaling, via ATM-dependent E6AP phosphorylation, caused MASTL accumulation over time. This ultimately shifts the balance toward checkpoint recovery and cell cycle re-entry. As such, the role of MASTL (and E6AP depletion) in suppressing DNA damage checkpoint is in harmony with the proposed role of MASTL upregulation in promoting checkpoint recovery. We have made additional clarifications about this point in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      In principle a very interesting story that attempts to shed light on an intriguing and poorly understood phenomenon: the link between damage repair and cell cycle re-entry once a cell has suffered from DNA damage. The issue is highly relevant to our understanding of how genome stability is maintained or compromised when our genome is damaged. The authors present the intriguing conclusion that this is based on a timer, implying that the outcome of a damaging insult is somewhat of a lottery; if a cell can fix the damage within the allocated time it will maintain stability, if not then stability is compromised. However, the story in its present form suffers from a number of major gaps that will need to be addressed.

      Major point:

      My primary concern regarding the main conclusion is that altered MASTL regulation seems to be doing much more than simply promoting more rapid recovery after DNA damage. This concern comes from the following gaps that I noted whilst reading the paper:

      • Knockout of E6AP leads to a dramatic inhibition of ATM/ATR activation after damage (Fig. 5C, D, E); this is (partially) rescued by co-depletion of MASTL (Fig. 5E). The authors will have to show that the primary effect of altered MASTL regulation is improved recovery, rather than reduced checkpoint activation. In other words, is initial checkpoint activation in cells that have lost E6AP normal, or do these cells fail to mount a proper checkpoint response? If the latter is true, that could completely alter the take-home message of this paper, because it could mean that E6AP/MASTL do not act as a "timer", but as a "tuner" to set checkpoint strength at the start of the DNA damage response. The authors themselves conclude on page 8 "E6AP promoted DNA damage checkpoint signaling by counteracting MASTL", but in the abstract the conclusion is "E6AP depletion promoted cell cycle recovery from the DNA damage checkpoint, in a MASTL-dependent manner". These two conclusions are definitely not in alignment: do E6AP/MASTL control checkpoint signaling or do they control recovery?

      The expression level of MASTL is not upregulated during the activation stage of the DNA damage checkpoint (unless E6AP is depleted). DNA damage signaling, via ATM-dependent E6AP phosphorylation, caused MASTL accumulation over time. This ultimately shifts the balance toward checkpoint recovery and cell cycle re-entry. As such, the role of MASTL (and E6AP-depletion) in suppressing DNA damage checkpoint is in harmony with the proposed role of MASTL upregulation in promoting checkpoint recovery. We have made additional clarifications about this point in the revised manuscript. We have also clarified the statement indicated by the reviewer.

      • MASTL KD has a rather unexpected effect on cell cycle progression after HU synchronization (Fig.5B). It seems that the MASTL KD cells fail to exit from the HU-imposed G1/S arrest, an effect that is not rescued in the E6AP knock-outs. Inversely, E6AP knock-outs seem to more readily exit from the HU-imposed arrest, an effect that is completely lost after knock-down of MASTL. How do the authors interpret these results? Their conclusions are entirely based on a role for MASTL as a driver of mitotic entry, with E6AP in control of its levels, but this experiment suggests that MASTL and E6AP are controlling very different aspects of cell cycle control in their system.

      As the reviewer pointed out, our data on checkpoint signaling and cell cycle progression suggested that MASTL upregulation could also inhibit DNA damage signaling, in addition to promoting cell cycle progression. This manner of inhibitory feedback modulation of DNA damage signaling by mitotic kinases (e.g., PLK1, CDK) has been implicated in previous studies (reviewed in Cell & Bioscience volume 3, Article number: 20 (2013)). In the revised manuscript, we have included discussions about this aspect of checkpoint regulation.

      • It is not possible to evaluate the validity of the conclusions that are based on Figure 6. We need to know how long the cells were treated with HU to disrupt the interaction between E6AP and MASTL. Is the timing of this in the range of the timing of MASTL increase after damage? A time course experiment is required here.

      • The data obtained on E6AP-S218 phosphorylation and with the S218A mutant during damage and recovery look very promising. But again, the release from HU is confusing me as to what to conclude from them. Also, the authors should show how S218A expression affects MASTL levels (before and after damage). Also, a time course of ATM/ATR activation is required to decide if initial or late ATM/ATR signaling is affected.

      Detailed time point information is now added in the figure legends for E6AP phosphorylation and E6AP-MASTL dissociation data. We were able to observe these events during early stages (e.g., 1 hr, or 2-4 hr) of the DNA damage response, prior to significant MASTL protein accumulation.

      • The conclusion that "and was not likely to be caused by the completion of DNA repair, as judged by the phosphorylation of replication protein A" (page 5) is based on western blots that represent the average across the entire population. It is possible that MASTL expression is still low in the cells that have not completed repair, while its increase on blots comes from a subset of cells where repair is complete. The authors should perform immunofluorescence so that expression levels of MASTL can be directly compared to levels of phospho-RPA in individual cells. In fact, the manuscript could benefit a lot from a more in-depth single-cell (microscopy)-based analysis of the relations over time between ATM/ATR activation, E6AP phosphorylation, MASTL stabilization versus the checkpoint arrest and subsequent recovery.

      Time point analyses were provided for DNA damage-induced RPA phosphorylation and ATM/ATR substrate phosphorylation (Fig. 1). These data showed MASTL accumulation in the presence of active DNA damage checkpoint signaling. To further strengthen this point, as suggested by the reviewer, we have included quantification of fluorescent measurements (Fig. 2, supplemental 1 A-C). IF data showed MASTL upregulation in correlation with ATM/ATR activation.

      Minor points:

      It's not "ionized radiation", but "ionizing radiation" (page 5)

      We have made the correction as pointed out by the reviewer.

      Expression levels of MASTL should be quantified over time after DNA damage. In some of the experiments the increase seems to plateau relatively quickly (HU treatment, Fig. 1B, 1-2 hours), while in others the levels continue to increase over longer periods (HU treatment, Fig. 1D, 6 hours). This is relevant to the timer function of MASTL that is proposed here.

      The kinetics of MASTL upregulation is generally consistent among all cell lines. As suggested, quantification of immunoblots is provided (Fig. 1, supplemental 1E); additional quantification of IF signals is also included (Fig. 2, supplemental 1 A-C).

      The experiment executed with caffeine (page 5) should be repeated with more selective/potent ATM/ATR inhibitors that are commercially available.

      A specific ATM inhibitor was used to confirm the caffeine result (Fig. 7, supplemental 1B and C).

      "a potential binding pattern" (page 6) should be "a potential binding partner"

      We have made the correction as pointed out by the reviewer.

      Reviewer #2 (Recommendations For The Authors):

      1) All western blots require size markers. The FACS plots shown do not have any axis labels.

      We have included size markers for blots, at the first appearance of each antibody. Axis labels have been added to the FACS plots.

      2) The quantification of mitotic cells does not indicate how many cells were counted and if this was done by eye or using software.

      The missing experimental information is included in the figure legends, as suggested.

      3) The western blots demonstrating ubiquitylation of Greatwall (Figure 4D) are of very poor quality and impossible to interpret.

      The ubiquitination of MASTL did not show clear ladders, possibly due to its relative protein size.

      Reviewer #3 (Recommendations For The Authors):

      Specific suggestions to improve the manuscript:

      1) Include literature regarding known mediators of DNA damage checkpoint recovery, including MASTL/Greatwall and PP2A, in the manuscript and discuss the observations from this manuscript in relationship with the literature.

      The related literature is now included in the discussion.

      2) The increase in MASTL protein levels upon DNA damage is not always clear, for example in Fig. 1A. The same holds for MASTL stability after DNA damage, such as in Fig. 2C. Quantification of the westerns would help demonstrate a significant effect.

      As suggested by the reviewer, we have included quantification of fluorescent measurements (Fig. 2, supplemental 1 A-C). Quantification of immunoblots for MASTL upregulation was also added in Fig. 1, supplemental 1E.

      3) The E6AP-MASTL in vitro interaction studies shown in Fig. 3 raise doubts. First, beads only are used as negative control, whereas MBP only-beads are a better control. The westerns in top panels of 3B (MASTL), 3C (GST-MASTL) and 3D (MASTL) should be improved. In addition, in Fig. 3C, different GST-MASTL fragments are used in an MBP-E6AP pull down, but the GST-MASTL input does not show any specific band to demonstrate that these fragments are correct. The same applies to the GFP-E6AP fragments in Fig. 3 Suppl. 1C. The input does not show any proteins, there is no N fragment present in the IP and the size of the fragment N3 in the IP GFP does not seem correct.

      Altogether, it makes me doubt that the interaction between E6AP and MASTL is direct. Better data with appropriate controls should show whether the interaction is direct or mediated via another protein.

      Purified proteins used for the in vitro interaction had significant degradation, causing many bands in the input. We included a lighter exposure of the input here as Author response image 1. MBP alone did not bind MASTL, as both M and C segments of MASTL were MBP-tagged, and did not pull down MASTL. We agree with the reviewer that our direct interaction data showed rather weak MASTL/E6AP interaction, suggesting the interaction is dynamic or possibly mediated by additional binding proteins. We have included this statement in the revised manuscript: “Taken together, our data characterized MASTL-E6AP association which was likely mediated via direct protein interaction, although the potential involvement of additional binding partners was not excluded”.

      Author response image 1.

      4) Fig. 4B. Overexpression of HA-E6AP results in a decrease in MASTL protein levels. Can this effect be rescued by treatment with proteasome inhibitor MG132?

      As expected, MG132 stabilized MASTL, with or without E6AP overexpression. We have added these new data in Fig. 4, supplemental 1B.

      5) Fig. 4G. MASTL interacts with HA-ubiquitin in WT, but not E6AP KO cells. These cells are treated with MG132, so if E6AP really ubiquitinates MASTL, I would expect MASTL to be polyubiquitinated. However, the "interaction signal" does not show polyubiquitination. In fact, this band actually runs lower than MASTL in input samples, which even could be an artifact. Please explain.

      The ubiquitination of MASTL did not show clear ladders, possibly due to its relative protein size. As the reviewer noted, the band position in the HA-Ub IP lanes seemed slightly shifted, compared to the input. We have noticed in many experiments that bands in the IP lanes did not perfectly align with the input lanes.

      6) The DNA damage recovery experiments measuring mitotic index after washing off etoposide (Fig. 5A and Fig. 8A): What are the time points taken? And importantly, why are there no error bars on these intermediate time points, but only on the 4 hour time point?

      As suggested, time point information and additional error bars are included.

      7) Fig. 5E. According to the authors, depletion of MASTL rescues the effect of KO of E6AP. However, no increase in pATM/ATR substrate signal is seen upon etoposide treatment in these samples so I am not convinced this experiment demonstrates a rescue.

      The rescue was evident, especially for many high molecular weight bands which were more effectively detected by this phospho-specific antibody.

      8) Fig. 5C and 8D strongly suggest that E6AP is involved in checkpoint activation. How do these data relate to DNA damage recovery? Is the recovery in E6AP KO cells faster as a consequence of reduced checkpoint signaling or is the recovery effect really specific by stabilization of MASTL? These data should be explained, also taken the data from Wong et al. (Sci. Rep. 2016) into account, that demonstrate that G2 checkpoint activation is independent of MASTL.

      The expression level of MASTL is not upregulated during the activation stage of the DNA damage checkpoint (unless E6AP is depleted). DNA damage signaling, via ATM-dependent E6AP phosphorylation, caused MASTL accumulation over time. This ultimately shifts the balance toward checkpoint recovery and cell cycle re-entry. As such, the role of MASTL (and E6AP-depletion) in suppressing DNA damage checkpoint is in harmony with the proposed role of MASTL upregulation in promoting checkpoint recovery. We have made additional clarifications about this point in the revised manuscript.

      9) The model presented in Fig. 9 is puzzling because there does not seem to be a difference between phosphorylation of E6AP and the interaction with MASTL at early versus late times after DNA damage. And this is exactly what is missing in the manuscript: a more detailed evaluation of the timing of E6AP Ser218 phosphorylation and the E6AP-MASTL interaction in response to DNA damage.

      More clarification is given to explain this model in the figure legend of Fig. 9.<br /> Time point analyses were provided for DNA damage-induced RPA phosphorylation and ATM/ATR substrate phosphorylation (Fig. 1). These data showed MASTL accumulation in the presence of active DNA damage checkpoint signaling. To further strengthen this point, we have included quantification of fluorescent measurements (Fig. 2, supplemental 1 A-C). IF data showed MASTL upregulation in correlation with ATM/ATR activation. Time point information was also added for Ser-218 phosphorylation and MASTL-ENSA dissociation which were observed in early stages of the DNA damage response (1 hr, or 2-4 hr).