  1. Oct 2020
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful for the insightful, constructive and very positive reviews provided by the three reviewers. Please find responses to each of the reviewer comments below.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors study proteins localised to the apical end of the highly polarised parasites causing toxoplasmosis and malaria. They find new proteins using BioID and examine the localisation of these along with recently identified proteins in the two different parasites. The key question they address is whether there is a conservation of the apical components in these distantly related parasites as well as in some even more distantly related organisms. This is an important question as the apical part comprises many essential proteins of invasion of host cells and shows a unique structure that defines the apicomplexans as a group. The apical structure can be highly elaborate such as in T. gondii and less elaborate as in P. falciparum. The authors now show that there is a large conservation between the species in the protein makeup of the apical end. The experiments are well performed, displayed and discussed and there is no doubt about the validity of the presented results. The text is eloquently written, if at times a bit wordy.

      My only main suggestion would be to possibly add data on gene disruption of the two candidates (0310700 and 1216300) that are not detected in blood stage parasites but in the insect stages. A deletion of these should be technically straightforward and would show whether the proteins are important to the parasite. Likely not all of the now many proteins are essential for the parasites but these are good candidates to rapidly investigate. But showing a functional impact might convince editors at certain journals.

      Authors’ response: The central aim of this study was to ask if the molecular composition of the conoid complex is conserved across Apicomplexa. Functional dissection of proteins is part of an exciting set of subsequent questions and studies that will now follow by us and others. However, careful and thorough phenotyping of gene disruptions is not trivial work, would be most informative to perform in both Toxoplasma and Plasmodium, and is therefore beyond the scope of this project. Regarding the two proteins suggested by this reviewer for follow-up work and the question of ‘essentiality’, that the proteins have not been lost during parasite selection through evolution is clear evidence of their relevance to the biology of Plasmodium.

      Other suggestions in chronological order (line numbers would have helped)

      title: maybe write 'conoid complex proteome'

      Authors’ response: while we initially thought that this change would be suitable, given that the subsequent part of the title is ‘reveals a cryptic conoid feature’ we think it is clearer and more logical to leave this title in its original form. The conoid complex includes the apical polar rings, and these are not considered to be cryptic or previously unrecognised, only the conoid. While our study confirms that there is conservation across all proteome components of the conoid complex, this is secondary to the primary question of this study.

      abstract: not sure about the use of the words instrument and substructures

      Authors’ response: we believe that the use of ‘instrument’ is an appropriate analogy of a tool and not different from the use of ‘machine’ and ‘machinery’ that is widely used in molecular and cellular biology. Similarly, ‘substructure’ acknowledges that within recognised structures, such as the conoid, there is further specific organisation such as the conoid base or apex.

      page 2 last lines: is tubulin monomeric or polymerized?

      Authors’ response: to specify the polymerized state of tubulin as mentioned here the text has been changed to ‘the presence of tubulin polymers’.

      page 3 name protein talked about in 9th line

      Authors’ response: we have now named this protein (RNG2) as suggested.

      third paragraph: mention previous proteomics studies e.g. from Ke Hu (mentioned later in discussion)

      Authors’ response: We feel that it is more appropriate to leave the discussion of the Hu et al (2006) proteomics study, along with various subsequent approaches used in pursuit of discovering conoid-associated proteins, to the discussion as currently occurs. In the introduction we seek to efficiently inform the reader of the current state of knowledge that makes the value and nature of the questions that we have asked in this study apparent. But we do give full credit and evaluation of previous studies in the discussion which we think is the most appropriate place for this.

      first paragraph or results could go into introduction

      Authors’ response: The first paragraph of the Results contains specific detail of just one aspect of this study, the use of hyperLOPIT. This is relevant to the new analysis that we have made of the hyperLOPIT data in this study. We, therefore, believe that it is most appropriately presented here in the Results in association with the new analyses we described. Our aim is that the Introduction is succinct and serves the entire study.

      page 4: add reference after BioID

      Authors’ response: reference added as suggested

      page 5: add definitions of the conoid; what technique was used to report YFP-SAS6?

      Authors’ response: It is unclear what this reviewer is requesting with respect to definitions of the conoid on this page. Nevertheless, we have now included a thorough definition of the conoid based on the original electron microscopy studies (fourth paragraph of the Introduction).

      With respect to the technique used to report on YFP-tagged SAS6 in the de Leon et al 2013 study, we now include fuller description of this previous study as follows:

      ‘The fluorescence imaging used in the de Leon et al study was limited to lower resolution widefield microscopy. Immuno-TEM was also used, however, contrary to their conclusions, did show YFP presence throughout transverse and oblique sections of the conoid consistent with our detection of SAS6L throughout the conoid body.’

      page 7: 'showed similar localisation' instead of 'phenocopied'?; add reference after ookinete stage; add expression levels from PlasmoDB to the Table 1 data at least for merozoites, ookinetes and sporozoites or add separate table for the 9 proteins in supplement

      Authors’ response: ‘phenocopied’ replaced, as suggested. Reference added after ookinete stage, as suggested.

      As requested, we have compiled available expression data for the Plasmodium proteins throughout the different zoite stages and will include these data as supplemental material in our subsequent revision.

      Discussion: Maybe discuss that the conoid complex is a cytoskeletal structure and that the other cytoskeletons (actin, microtubules, subpellicular network) also differ between the species investigated in their composition and overall architecture

      Authors’ response: These are reasonable suggested analogies and we will introduce them in the subsequent revision.

      page 9: at least two proteins could be deleted as they seem to not confer any growth defect on blood stages (see main comment)

      Authors’ response: This reviewer has not linked this comment to a specific statement on page 9. However, we are cautious not to equate a lack of observed growth defects in experimental scenarios with unimportant or irrelevant proteins. Maintenance of the proteins of a structure through natural selection and evolution indicates that they are selectively advantageous and of functional relevance. The two proteins in question are not expressed in the blood stage, so one wouldn’t expect their deletion to have consequence in this stage.

      Apart from classic TEM images also Cryo EM data is available for apex of merozoite and sporozoite. Worth to discuss?

      Authors’ response: Following this reviewer’s subsequent suggestion (below), we are now preparing schematics of each of the zoite stages of Plasmodium for the subsequent revision, and these draw on cryo-EM tomography data.

      Add and discuss the recent work from Curr Biol and EMBO J of the Yuan lab on ookinete formation?

      Authors’ response: These two reports are excellent studies of the polarised development of the cell pellicle during ookinete formation and control of gliding initiation, but they do not specifically relate to the conoid complex structures that are the subject of our study. We, therefore, do not see a logical place to include discussion of these works.

      Reviewer #2 (Significance (Required)):

      The paper provides a conceptual advance over previous data as it shows clearly a high level of conservation of the protein components of the conoid complex. It could introduce a new terminology for these important apical structures of apicomplexan parasites and provides a good basis to dissect the molecular functions.

      Authors’ response: We appreciate this reviewer recognising this opportune point in time to more clearly define the terminology applied to these apical structures so that they can be more clearly and easily compared between taxa. We will use the suggested schematic figure (see comment below) that is now in preparation as a basis and guide for a refined nomenclature based on precedent in the literature.

      As it stands all scientists investigating Plasmodium and Toxoplasma invasion of host cells will be highly interested in this study, most scientists researching apicomplexan organisms should be and some evolutionary scientists will be interested in this study.

      Key papers in the field are the discovery of the Toxoplasma conoid as a highly twisted microtubule-like structure (Hu et al., JCB 2002; doi: 10.1083/jcb.200112086) the first description of an apical proteome (Hu et al., PLoS Path 2006; 10.1371/journal.ppat.0020013), the description of a tilted arrangement of the rings in Plasmodium versus Toxoplasma (Kudryashev et al., Cell Microbiol 2012; doi: 10.1111/j.1462-5822.2012.01836.x) and the discovery of apical located proteins that are essential for conoid formation (Tosetti et al., eLife 2020; 10.7554/eLife.56635) to name a few.

      If intended for a broader audience, a cartoon of a conoid complex across the different species investigated and discussed here would help for visual guidance highlighting the similarities and differences

      Authors’ response: This is a good suggestion and we are presently preparing a schematic of all stages studied and supporting this with electron microscopy.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this work, Koreny et al. characterized the localization of a new collection of conoid proteins in Toxoplasma gondii as well as in several different stages of Plasmodium berghei. The authors discovered that these proteins are located in several distinct substructures in Plasmodium and are expressed in a stage-specific manner. The data are of high quality, well‐organized, and well presented. The paper is well written. The introduction, in particular, was a pleasure to read. This reviewer (Ke Hu) does not have any new experiments to suggest.

      However, while the authors present LOPIT+BIOID as a powerful approach to identify conoid proteins, implying that it is more reliable than previously published approaches (see below), the manuscript includes no data to show what the false positive or false negative rate is with the current approach, nor any estimate of how many conoid proteins were missed entirely.

      Authors’ response: In our validation of putative conoid-associated proteins identified by the hyperLOPIT+BioID approach we reporter-tagged 18 proteins to resolve their cellular location by microscopy. All 18 were verified as being located at the site of the conoid. So, by this measure there were no false positives. The veracity of the hyperLOPIT data was also confirmed across other cell compartments in our report where 62 proteins were reporter-tagged, from which there were no false positive assignments of cell location (Barylyuk et al., 2020, Cell Host & Microbe, in press, doi:10.1016/j.chom.2020.09.011; bioRxiv: https://doi.org/10.1101/2020.04.23.057125).

      Estimating false negatives is more difficult, but we know that these would occur as for any mass spectrometry-based detection technique. However, we have not claimed to have been exhaustive, nor was this required to answer our central question: are there conserved conoid-associated proteins throughout Apicomplexa? To address this question, we required a good sample of proteins, and the methods that we have employed provided this.

      Page 7: "Previous identification of conoid complex proteins used methods including subcellular enrichment, correlation of mRNA expression, and proximity tagging (BioID) (Hu et al. 2006; Long, Anthony, et al. 2017; Long, Brown, et al. 2017). Amongst these datasets many components have been identified, although often with a high false positive rate. We have found the hyperLOPIT strategy to be a powerful approach for enriching in proteins specific to the apex of the cell, and BioID has further refined identification of proteins specific to the conoid complex region."

      The authors should state whether the candidate proteins were chosen in an unbiased way or not.

      Authors’ response: Candidate proteins selected for validation by microscopy were not biased for any known likelihood of being associated with the conoid, other than our proteomics data that we were seeking to test. However, we did preferentially select proteins with the following traits: 1) proteins with strong corresponding gene knockout fitness phenotypes from published studies, 2) proteins with some evidence of conserved functional domains, and 3) genes with orthologues found in Plasmodium spp. and other apicomplexans. These traits were chosen with future functional studies in mind where proteins might be more informative of conoid-related functions and relevance in other apicomplexans. All validated proteins, however, were otherwise uncharacterised and, therefore, were not knowingly biased for more likely conoid-association over others discovered by our proteomics approach. We now include the following statement.

      “All proteins selected for validation were previously uncharacterised and with no a priori reason to be identified as conoid-associated other than our proteomics data.”

      If so, how many proteins were localized to the conoid and how many were not?

      Authors’ response: as stated above, we observed no false positives from the sample of 18 protein locations verified by microscopy.

      Related to this, the majority (14 out of 20) of the conoid proteins identified by LOPIT+BIOID in this paper were previously identified as conoid candidate proteins in Hu et al's 2006 paper, based on the number of peptides retrieved from the conoid enriched vs depleted fractions. Those data (see below) have been available from ToxoDB for many years and should be acknowledged.

      Accession# - conoid enriched : conoid depleted (from Hu et al. 2006)

      222350 - 2:0

      274120 - 3:0

      291880 - 1:0

      301420 - 3:1

      246720 - 4:0

      258090 - 10:0

      266630 - 8:1

      208340 - 4:2

      253600 - 1:0

      306350 - not found

      250840 - 1:0

      292120 - not found

      219070 - not found

      274160 - not found

      320030 - 7:1

      227000 - 10:0

      278780 - not found

      284620 - not found

      295420 - 6:0

      297180 - 4:0

      Authors’ response: Proteomic methods and mass spectrometry have experienced revolutionary advances since this 2006 study was conducted. These include improvements in both sensitivity and quantitation accuracy. The Hu et al 2006 study provided an exciting first step towards conoid protein discovery. However, by their original estimation, at least 35% of their putative conoid-specific proteins were identifiable as false positives (e.g. ribosomal proteins) and this estimate could not account for the majority of uncharacterised proteins whose potential for false positive attribution to the conoid was untested. From almost 300 proteins, this study only validated four as associated with the conoid. The further proteins listed above were not validated as conoid proteins in the Hu et al study and, therefore, could not be distinguished from the many false positives reported in their work. In our Table 1, we have acknowledged the Hu et al study for the select proteins that they established as conoid proteins in their study.

      To further assess the utility of this 2006 conoid-enriched proteome we sorted the Hu et al detected proteins on our full hyperLOPIT assignments. Of the proteins that were reported by Hu et al as either exclusive to the conoid-enriched fraction or enriched by at least 2-fold over the conoid-depleted fraction, 15% were assigned to the apical 1 and 2 clusters (representing the relevant compartments to the conoid complex). Thus, according to the hyperLOPIT data these represent the true positives found in this study and 13 of these proteins were independently validated as conoid-associated by us. Significantly, however, 85% of the conoid-exclusive and conoid-enriched proteins from Hu et al (2006) were allocated to a non-apical location with 99% probability by hyperLOPIT, and, during our validation of 62 assignments we verified the alternative location of eight of these. False positives, therefore, greatly outnumbered true positives in this earlier dataset. This high rate of false positives in subcellular isolation proteomics is typical of the challenges that this method faces, and this was the rationale for and strength of the alternative hyperLOPIT approach. Given the overall relatively low level of conoid specificity in the earlier work we do not think that there is value in making specific protein-by-protein reference to it.
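As an illustrative aside, the peptide counts quoted by the reviewer above can be tallied in a few lines. This is a minimal sketch, not part of either party's analysis: the dictionary simply transcribes the reviewer's list, and the helper `is_enriched` and its `min_fold` parameter are hypothetical names that apply the "exclusive or at least 2-fold enriched" criterion described in the response to this short list only, not to the full 2006 dataset.

```python
# Peptide counts (conoid-enriched, conoid-depleted) transcribed from the
# reviewer's list; None marks accessions reported as "not found".
hu_2006_counts = {
    "222350": (2, 0), "274120": (3, 0), "291880": (1, 0), "301420": (3, 1),
    "246720": (4, 0), "258090": (10, 0), "266630": (8, 1), "208340": (4, 2),
    "253600": (1, 0), "306350": None, "250840": (1, 0), "292120": None,
    "219070": None, "274160": None, "320030": (7, 1), "227000": (10, 0),
    "278780": None, "284620": None, "295420": (6, 0), "297180": (4, 0),
}


def is_enriched(counts, min_fold=2.0):
    """Hypothetical helper: True if peptides were exclusive to the
    conoid-enriched fraction or at least `min_fold` enriched over the
    conoid-depleted fraction."""
    if counts is None:
        return False
    enriched, depleted = counts
    if depleted == 0:
        return enriched > 0
    return enriched / depleted >= min_fold


detected = [acc for acc, c in hu_2006_counts.items() if c is not None]
enriched = [acc for acc, c in hu_2006_counts.items() if is_enriched(c)]

print(f"{len(detected)} of {len(hu_2006_counts)} accessions were detected")
print(f"{len(enriched)} meet the exclusive-or-2-fold criterion")
```

The tally reproduces the reviewer's "14 out of 20" figure. It says nothing about how many of the roughly 300 proteins in the earlier dataset fall outside the apical clusters, which is the point the response above turns on.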

      Reviewer #3 (Significance (Required)):

      see above

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      This manuscript details the further use of the hyperplexed Localisation of Organelle Proteins by Isotope Tagging (hyperLOPIT) that the group has previously published using T. gondii tachyzoites by combining this with BioID and super-resolution microscopy in order to uncover new proteins that form part of a structurally known and functionally elusive conoid. The authors conclusively identified new proteins that localise to the conoid structure in T. gondii and also excitingly showed that not only is this structure found in all invasive forms of Plasmodium (using the P. berghei model) but there also is a different molecular make up in the blood stage merozoites which have a slightly reduced number of proteins (or possibly as yet unknown alternatives) compared to ookinetes and sporozoite conoid structures. This study is scientifically sound and the conclusions reached are well supported by the results presented.

      **Major Comments:** No major comments

      **Minor Comments:**

      1)While both the introduction and discussion are well written and detailed they could both be a little more concise.

      Authors’ response: We take this as a style recommendation, but we note that the other reviewers commented on the text’s “eloquence” and that the introduction in particular was a “pleasure to read”. We take these comments as votes of confidence in the current form.

      2)Selection of the 5 new genes in Tg to be tagged (top pg 5) it was not clear as to the selection criteria for these 5.

      Authors’ response: Please see the same query, and response with modified text, made by Reviewer #3.

      This also leads to the second part of this question where there appears to be some genes missing from Table 1 and Table S1, specifically those found in both SAS6L and RNG2 BioID. It was mentioned that 25 were identified in both SAS6L and RNG2 BioID. In Table 1 (there are 23) there is no mention of 223790, 281650, 224700, and 293540 but they are in the Table S1 (assuming these 4 are not selected in this study for tagging) but in table S1 (there are 25 listed) 216080 (AKMT) and 234250 (CIP1) that are in the Table 1 as being identified in both SAS6L and RNG2 BioID are absent from the Table S1 does this mean there are actually 27 or was the indication of identified in both SAS6L and RNG2 BioID for 216080 (AKMT) and 234250 (CIP1) in Table 1 a mistake?

      Authors’ response: This reviewer has overlooked that Table 1 reports on all currently known conoid associated proteins, including those not detected in the hyperLOPIT data but reported in the literature, whereas Table S1 is exclusively those proteins detected and assigned as ‘apical’ by hyperLOPIT. The reported BioID-detection for each protein is then made within this framework. Thus, the proteins that occur in only one or the other table do so because they don’t satisfy these two sets of criteria. We have rechecked the numbers reported in the text and they are correct.

      3)Table 1: There is the fitness score for Pf orthologues but no mention of fitness in Pb (the model used) from the PlasmoGEM screens, considering the authors use the Pb model it would be of interest to add this in the table.

      Authors’ response: The Plasmodium berghei PlasmoGEM gene disruption screen was much more limited in number than that for P. falciparum. Consequently, fitness scores were available for only two of the Plasmodium orthologues for which we have location data. We, therefore, thought it was of limited utility to include these data in Table 1, and these data are in the public domain should a reader seek them.

      4)Figure 2: The image for localisation with SAS6L for 291880 and 258090 appear to be missing.

      Authors’ response: Initially we did not make the separate transgenic cell lines for each protein with both the SAS6L and RNG2 markers. This was because one marker was usually sufficient to resolve the relative location of the protein of interest. However, given this reviewer’s comment and the potential for some extra information to be recovered by using both markers, we have now generated all cell lines necessary for this analysis. We are presently completing the imaging of these new cell lines and these data will be included in the subsequent revision.

      5)Figure 3: It is unclear why both SAS6L and RNG2 are not used for all localisations shown (this could be clarified in the text)

      Authors’ response: see previous comment.

      6)Figure 5: It is a shame only 7 of the 9 plasmodium orthologues were included in the super resolution as there is only 2 more to have the complete set.

      Authors’ response: Ideally, we would have been able to achieve this but, the restrictions imposed by the COVID-19 disruption to laboratory access and activities ultimately slightly limited these analyses. However, to answer the central question of whether there is conservation of the Toxoplasma conoid proteome in Plasmodium it was not necessary to perform super resolution imaging for all of these proteins. The major outcome of this study, therefore, is not affected by this.

      7)Figure 6: As with Figure 5 it would be better if more were included in the super-resolution images in this sporozoite stage.

      Authors’ response: Same response as above. Generation of sporozoites requires passage through the mosquito vector so this is even more resource-intensive than generation of ookinetes that can be differentiated in vitro from mouse-derived parasites. Again, the answers to the central questions posed by this study do not require these further, high resolution, data.

      8)Figure 7: This would be improved with at least a selection (or even all 6) to have the super-resolution images (possibly even with free merozoites)

      Authors’ response: We did apply 3D-SIM imaging to fixed merozoites; however, unlike ookinetes and sporozoites, the imaged fixed material was inferior to the live cell GFP imaging that we have included. This likely reflects the poorer fixation properties of Plasmodium merozoites, a challenge of these cell forms that is widely experienced by Plasmodium researchers. We do not have access to a 3D-SIM microscope within a containment laboratory necessary for handling viable parasites and, therefore, could not attempt to image live material with this instrument. Again, the answers to the central questions posed by this study do not require these further, high resolution, data.

      9)As there are numerous new protein identified in 2 different parasites and with the composition of the conoid differing at different stages it would be beneficial to have some sort of schematic model of the apical complex in Tg and Pb indicating where each new protein localises

      Authors’ response: In response to this reviewer, and reviewer #2’s suggestion, we are now preparing schematic models of the apices of all of the relevant organism stages.

      Reviewer #4 (Significance (Required)):

      The authors have combined expert mass spectrometry and super-resolution microscopy to identify new components of the conoid in Tg and added to the knowledge that will help to uncover the function of the structure. But perhaps the most significant is the conclusive identification of the conoid in all 3 invasive stages of the Plasmodium parasite. Until now it was widely accepted that the conoid was missing in Plasmodium and to uncover multiple proteins that appear to make up and constitute this structure in Plasmodium is highly significant and clearly of interest to the apicomplexan field. Furthermore the suggestion that the conoid differs in the molecular makeup within Plasmodium depending on stage is very intriguing and clearly of interest. This paper expertly combined cutting-edge proteomics and microscopy to identify the conoid in Plasmodium. This manuscript would have a broad readership in parasitology, proteomics, and cell biology.

      Our expertise is largely in molecular parasitology and microscopy

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This manuscript details the further use of the hyperplexed Localisation of Organelle Proteins by Isotope Tagging (hyperLOPIT) that the group has previously published using T. gondii tachyzoites by combining this with BioID and super-resolution microscopy in order to uncover new proteins that form part of a structurally known and functionally elusive conoid. The authors conclusively identified new proteins that localise to the conoid structure in T. gondii and also excitingly showed that not only is this structure found in all invasive forms of Plasmodium (using the P. berghei model) but there also is a different molecular make up in the blood stage merozoites which have a slightly reduced number of proteins (or possibly as yet unknown alternatives) compared to ookinetes and sporozoite conoid structures. This study is scientifically sound and the conclusions reached are well supported by the results presented.

      Major Comments: No major comments

      Minor Comments:

      1)While both the introduction and discussion are well written and detailed they could both be a little more concise.

      2)Selection of the 5 new genes in Tg to be tagged (top pg 5) it was not clear as to the selection criteria for these 5. This also leads to the second part of this question where there appears to be some genes missing from Table 1 and Table S1, specifically those found in both SAS6L and RNG2 BioID. It was mentioned that 25 were identified in both SAS6L and RNG2 BioID. In Table 1 (there are 23) there is no mention of 223790, 281650, 224700, and 293540 but they are in the Table S1 (assuming these 4 are not selected in this study for tagging) but in table S1 (there are 25 listed) 216080 (AKMT) and 234250 (CIP1) that are in the Table 1 as being identified in both SAS6L and RNG2 BioID are absent from the Table S1 does this mean there are actually 27 or was the indication of identified in both SAS6L and RNG2 BioID for 216080 (AKMT) and 234250 (CIP1) in Table 1 a mistake?

      3)Table 1: There is the fitness score for Pf orthologues but no mention of fitness in Pb (the model used) from the PlasmoGEM screens, considering the authors use the Pb model it would be of interest to add this in the table.

      4)Figure 2: The image for localisation with SAS6L for 291880 and 258090 appear to be missing.

      5)Figure 3: It is unclear why both SAS6L and RNG2 are not used for all localisations shown (this could be clarified in the text)

      6)Figure 5: It is a shame only 7 of the 9 plasmodium orthologues were included in the super resolution as there is only 2 more to have the complete set.

      7)Figure 6: As with Figure 5 it would be better if more were included in the super-resolution images in this sporozoite stage.

      8)Figure 7: This would be improved with at least a selection (or even all 6) to have the super-resolution images (possibly even with free merozoites)

      9)As there are numerous new protein identified in 2 different parasites and with the composition of the conoid differing at different stages it would be beneficial to have some sort of schematic model of the apical complex in Tg and Pb indicating where each new protein localises

      Significance

      The authors have combined expert mass spectrometry and super-resolution microscopy to identify new components of the conoid in Tg and added to the knowledge that will help to uncover the function of the structure. But perhaps the most significant is the conclusive identification of the conoid in all 3 invasive stages of the Plasmodium parasite. Until now it was widely accepted that the conoid was missing in Plasmodium and to uncover multiple proteins that appear to make up and constitute this structure in Plasmodium is highly significant and clearly of interest to the apicomplexan field. Furthermore the suggestion that the conoid differs in the molecular makeup within Plasmodium depending on stage is very intriguing and clearly of interest. This paper expertly combined cutting-edge proteomics and microscopy to identify the conoid in Plasmodium. This manuscript would have a broad readership in parasitology, proteomics, and cell biology.

      Our expertise is largely in molecular parasitology and microscopy

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In this work, Koreny et al. characterized the localization of a new collection of conoid proteins in Toxoplasma gondii as well as in several different stages of Plasmodium berghei. The authors discovered that these proteins are located in several distinct substructures in Plasmodium and are expressed in a stage-specific manner. The data are of high quality, well‐organized, and well presented. The paper is well written. The introduction, in particular, was a pleasure to read. This reviewer (Ke Hu) does not have any new experiments to suggest.

      However, while the authors present LOPIT+BIOID as a powerful approach to identify conoid proteins, implying that it is more reliable than previously published approaches (see below), the manuscript includes no data to show what the false positive or false negative rate is with the current approach, nor any estimate of how many conoid proteins were missed entirely.

      Page 7: "Previous identification of conoid complex proteins used methods including subcellular enrichment, correlation of mRNA expression, and proximity tagging (BioID) (Hu et al. 2006; Long, Anthony, et al. 2017; Long, Brown, et al. 2017). Amongst these datasets many components have been identified, although often with a high false positive rate. We have found the hyperLOPIT strategy to be a powerful approach for enriching in proteins specific to the apex of the cell, and BioID has further refined identification of proteins specific to the conoid complex region."

      The authors should state whether the candidate proteins were chosen in an unbiased way or not. If so, how many proteins were localized to the conoid and how many were not? Related to this, the majority (14 out of 20) of the conoid proteins identified by LOPIT+BIOID in this paper were previously identified as conoid candidate proteins in Hu et al.'s 2006 paper, based on the number of peptides retrieved from the conoid-enriched vs conoid-depleted fractions. Those data (see below) have been available from ToxoDB for many years and should be acknowledged.

      Accession# - conoid enriched : conoid depleted (from Hu et al. 2006)

      222350 - 2:0

      274120 - 3:0

      291880 - 1:0

      301420 - 3:1

      246720 - 4:0

      258090 - 10:0

      266630 - 8:1

      208340 - 4:2

      253600 - 1:0

      306350 - not found

      250840 - 1:0

      292120 - not found

      219070 - not found

      274160 - not found

      320030 - 7:1

      227000 - 10:0

      278780 - not found

      284620 - not found

      295420 - 6:0

      297180 - 4:0
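
      The list above can be tallied mechanically. The following is a minimal sketch that transcribes the peptide counts exactly as given in the list (with `None` standing for "not found") and counts how many of the 20 accessions had at least one peptide in the conoid-enriched fraction. The dictionary name and layout are illustrative assumptions, not part of the original dataset.

```python
# Peptide counts (conoid-enriched, conoid-depleted) transcribed from the list above.
# None marks accessions reported as "not found" in Hu et al. 2006.
hu2006_counts = {
    "222350": (2, 0), "274120": (3, 0), "291880": (1, 0), "301420": (3, 1),
    "246720": (4, 0), "258090": (10, 0), "266630": (8, 1), "208340": (4, 2),
    "253600": (1, 0), "306350": None, "250840": (1, 0), "292120": None,
    "219070": None, "274160": None, "320030": (7, 1), "227000": (10, 0),
    "278780": None, "284620": None, "295420": (6, 0), "297180": (4, 0),
}

# Accessions with any peptide evidence in the 2006 dataset.
found = {acc: c for acc, c in hu2006_counts.items() if c is not None}

# Accessions seen only in the conoid-enriched fraction (no depleted-fraction peptides).
enriched_only = [acc for acc, (enr, dep) in found.items() if dep == 0]

print(f"{len(found)} of {len(hu2006_counts)} accessions found")
print(f"{len(enriched_only)} found exclusively in the conoid-enriched fraction")
```

      The tally agrees with the "14 out of 20" figure quoted in the comment above.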

      Significance

      see above

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors study proteins localised to the apical end of the highly polarised parasites causing toxoplasmosis and malaria. They find new proteins using BioID and examine the localisation of these along with recently identified proteins in the two different parasites. The key question they address is whether there is a conservation of the apical components in these distantly related parasites as well as in some even more distantly related organisms. This is an important question as the apical part comprises many essential proteins of invasion of host cells and shows a unique structure that defines the apicomplexans as a group. The apical structure can be highly elaborate such as in T. gondii and less elaborate as in P. falciparum. The authors now show that there is a large conservation between the species in the protein makeup of the apical end. The experiments are well performed, displayed and discussed and there is no doubt about the validity of the presented results. The text is eloquently written, if at times a bit wordy. My only main suggestion would be to possibly add data on gene disruption of the two candidates (0310700 and 1216300) that are not detected in blood stage parasites but in the insect stages. A deletion of these should be technically straightforward and would show whether the proteins are important to the parasite. Likely not all of the now many proteins are essential for the parasites but these are good candidates to rapidly investigate. But showing a functional impact might convince editors at certain journals.

      Other suggestions in chronological order (line numbers would have helped)

      title: maybe write 'conoid complex proteome'

      abstract: not sure about the use of the words instrument and substructures

      page 2 last lines: is tubulin monomeric or polymerized?

      page 3: name the protein discussed in the 9th line

      third paragraph: mention previous proteomics studies e.g. from Ke Hu (mentioned later in discussion)

      first paragraph of results could go into introduction

      page 4: add reference after BioID

      page 5: add definitions of the conoid; what technique was used to report YFP-SAS6?

      page 7: 'showed similar localisation' instead of 'phenocopied'?; add reference after ookinete stage; add expression levels from PlasmoDB to the Table 1 data at least for merozoites, ookinetes and sporozoites or add separate table for the 9 proteins in supplement

      Discussion: Maybe discuss that the conoid complex is a cytoskeletal structure and that the other cytoskeletons (actin, microtubules, subpellicular network) also differ between the species investigated in their composition and overall architecture

      page 9: at least two proteins could be deleted as they seem to not confer any growth defect on blood stages (see main comment)

      Apart from classic TEM images, cryo-EM data are also available for the apex of the merozoite and sporozoite. Worth discussing?

      Add and discuss the recent work from Curr Biol and EMBO J of the Yuan lab on ookinete formation?

      Significance

      The paper provides a conceptual advance over previous data as it shows clearly a high level of conservation of the protein components of the conoid complex. It could introduce a new terminology for this important apical structure of apicomplexan parasites and provides a good basis to dissect the molecular functions. As it stands, all scientists investigating Plasmodium and Toxoplasma invasion of host cells will be highly interested in this study, most scientists researching apicomplexan organisms should be, and some evolutionary scientists will be interested in this study.

      Key papers in the field are the discovery of the Toxoplasma conoid as a highly twisted microtubule-like structure (Hu et al., JCB 2002; doi: 10.1083/jcb.200112086), the first description of an apical proteome (Hu et al., PLoS Path 2006; doi: 10.1371/journal.ppat.0020013), the description of a tilted arrangement of the rings in Plasmodium versus Toxoplasma (Kudryashev et al., Cell Microbiol 2012; doi: 10.1111/j.1462-5822.2012.01836.x) and the discovery of apically located proteins that are essential for conoid formation (Tosetti et al., eLife 2020; doi: 10.7554/eLife.56635), to name a few.

      If intended for a broader audience, a cartoon of a conoid complex across the different species investigated and discussed here would help for visual guidance, highlighting the similarities and differences.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-point response to reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors constructed a virtually complete fitness landscape of the P1 extension region (4-base-paired helix) in the group I intron from Tetrahymena thermophila, using a kanamycin resistance reporter to evaluate the fold-change in fitness, which is related to self-splicing activity. This was a clever choice of system because it was known from earlier work that the P1 extension adopts two different conformations during self-splicing. The fitness of each variant was determined from the number of reads acquired from the sequencing data sets and analyzed through an extensive computational pipeline. The strength of the paper is that this machine learning approach can be used to calculate how individual variants contribute to the fitness landscape and assess the directions of epistasis across a large number of identified genotypes.

      We thank the reviewer for highlighting one of the key strengths of our manuscript, the fact that our analytical approach, using SHAP values, enables contributions of individual variants to be assessed in a genotype-specific manner. This approach provides for a sound, robust, and principled way of describing and understanding the fitness impact of one mutation in the context of (potentially many) others.

      The authors argue that machine learning more successfully models subtle effects that arise from interactions between RNA residues, and that the power to analyze deep mutational sequencing experiments can better rationalize fitness constraints arising from multiple conformational states.

      We do indeed argue that machine learning is likely to play an increasing role in making sense of deep mutational scanning data. These scans provide high-resolution information on how fitness maps onto genotype, but the molecular underpinnings of this relationship often remain obscure. It is these “hidden” underpinnings, including the effects of specific mutations on RNA/protein folding, structures, and dynamics, that machine learning approaches can help elucidate.

      The results are mostly consistent with previous studies even though the authors collected the data in a more advanced and complicated way. They are also able to rationalize complex phenotypes - for example, the observed fitness defects are more prevalent under an unfavorable growth condition (30ºC), because the lower temperature hinders conformational exchange. Although such cold sensitive effects are well known in RNA, it is gratifying that this can be captured in the fitness landscape.

      Finding temperature-related fitness effects that are consistent with impaired conformational exchange was also gratifying for us and we thank the reviewer for highlighting this finding.

      The results would be more convincing if the authors directly measure the self-splicing activity of a few key variants, such as the C2C21 mutant, to determine whether these mutations alter the self-splicing mechanism of the Tte-119(C20A) master sequence in the way that they infer from their model. In interpreting their results, they may want to consider misfolding of the intron core (coupled to base pairing of P1) and reverse self-splicing. Reversibility in the hairpin ribozyme, for example, turned out to be the key for understanding the effects of certain mutations.

      We appreciate that measurements of splicing activity for individual genotypes would complement and further strengthen our study. We will therefore aim to construct strains for a few key genotypes and assay self-splicing activity using RT-qPCR – an approach we previously used successfully to monitor splicing kinetics of self-splicing introns in yeast mitochondria (see Rudan et al. 2018 eLife 7:e35330). Specifically, we will quantify the fraction of spliced and unspliced transcripts using primers that span the exon-exon and the 3’ exon-intron junction, respectively (the 5’ intron-exon junction is genotypically diverse and would require genotype-specific primers). This will be done under non-selective (-kan) conditions, where the relative fraction of spliced and unspliced transcripts is a function of intrinsic splicing ability and not confounded by selection. We aim to include the master sequence, C2C21, G3C20 and its mirror genotype C3G20, U3 (which restores perfect complementarity in the master sequence), and G5 (inferred from the high-throughput experiment to make a strong negative contribution to fitness).

      In interpreting our results, we will consider different mechanisms of splicing failure, such as kinetic problems (slow dissociation of P1ex), misfolding of the intron core, reverse self-splicing, and the use of cryptic splice sites, which has previously been documented (see e.g. Woodson & Cech 1991 Biochemistry 30:2042-2050). We note, however, that a precise mechanistic dissection of the splicing defects of individual variants is not the purpose of this manuscript and we therefore do not aim to establish genotype-specific defects in great molecular detail.

      Related to the point above, interesting conclusions regarding the relationships between base identity and epistasis that arise from metastability should be strengthened with additional examples. For example, the authors can explain why a reverse base-pairing variant (C3G20) exhibits negative epistasis but is not similar to that of the G3C20 construct. This would ideally use the data from the screen but also be validated by checking the self-splicing activity of a few individuals at low and high temperature.

      In measuring splicing activity and its link to fitness for a subset of key variants (see point #4), we will include at least one mirror example such as C3G20/G3C20. In addition, we will highlight additional examples of this mirror asymmetry based on the results from our high-throughput screen.

      They should validate the screen by showing that kanamycin resistance does indeed correlate strictly with self-splicing activity, and not some other feature such as RNA turnover. (It would also not be a bad idea to check this in the cell, which can be done by primer extension or Northern blotting.)

      This question (i.e. whether altered RNA stability rather than splicing efficiency explains differential KNT production and ultimately fitness) has previously been addressed by Guo & Cech (2002) when introducing the knt+intron reporter system. These authors found no difference in mRNA stability in constructs that displayed differential kanamycin resistance. To shore up this conclusion further, we will measure fitness (via colony counts, growth rate or more directly through competitive fitness assays) of the key variants for which we determine splicing activity (see point #4) and then correlate splicing and fitness.

      The benefit of the machine learning model is that it can extract signals that may be hard to detect otherwise. The downside is that it doesn't produce a physical model, as far as I am aware. The parameters are themselves not meaningful - except to the degree that trends in the fitness estimates can be explained after the fact. This is something that should ideally be explained more directly in the manuscript.

      The reviewer raises an interesting point, that indeed deserves further discussion/explanation. The reviewer is right that, at first sight, high-resolution fitness landscapes like ours do not directly produce a physical (structural) model of the molecule under investigation. They connect genotype and fitness, but the molecular intermediate – a biophysical structure – is not explicit. However, over the last few years, it has become apparent that deep mutational scanning experiments can – both in principle and in practice – yield information that can be leveraged to infer such a physical model. In short, covariation in fitness between residues in a protein or bases in an RNA can be used as inputs for constraint-based modelling of physical interactions. Notably, Schmiedel & Lehner (2019, Nature Genetics 51: 1177-1186) recently demonstrated that deep mutational scanning data can be used in this manner to reconstruct secondary and tertiary protein structure with high accuracy. In principle, the same approach can be used to reconstruct RNA structures. This will require more extensive, molecule-wide fitness data, but our study points towards just this future, even for data collected from structural ensembles.

      When we stated in the original manuscript that deconvolution of the fitness landscape might help to reverse engineer structures, this ability to interpolate between genotype and fitness to reveal hidden biophysical/structural relationships is what we refer to. We will revise the manuscript to make this connection more explicit.

      The authors claim that by evaluating a large number of sequences at two conditions, they can capture variants with intermediate phenotypes (Fig. 1). This is not necessarily true. If the original screen allows only the most active variants to survive on kan+ medium, then the signature of intermediate phenotypes may not be encoded in the original data, and thus not retrievable even with sophisticated algorithms, which may also be prone to overfitting. At what limit of stringency will the screen fail to yield information about intermediate fitness? How deeply must one sequence to recover this information, especially if noisy or degraded? Some discussion of these effects would be helpful.

      The capacity of any high-throughput sequencing-based DMS experiment to resolve intermediate phenotypes does indeed depend on a number of things. The reviewer highlights two of these: First, in screens where the phenotype is not binary (dead/alive) but fitness can be measured on a continuous scale, can we – and do we – capture phenotypes with intermediate fitness? What if only the fittest/most active variants survive? This is, ultimately, an empirical question, and one we can answer quite definitively: we do observe a large range of intermediate phenotypes, which – in our study – correspond to intermediate fold-change values. For each genotype, we can provide confidence limits and assess statistical significance. Table S1 provides this information. Our capacity to resolve these intermediate phenotypes is mainly based on three things. One is adequate sequencing depth, as highlighted by the reviewer. The second is the number of biological replicates (N=6) we analyse, which allows us to differentiate biological variability from noise for a large number of genotypes. This is an important aspect of DMS experiments that has often been overlooked (i.e. there are many other studies where only a single replicate is analysed and biological heterogeneity is not taken into account). With six replicates in hand, we can directly estimate variability (as done e.g. in our DESeq2 analysis) and quantify uncertainty so as to guard against overfitting. In our view, this is arguably more important than sequencing depth in deriving appropriate fitness estimates. Finally, we can resolve intermediate phenotypes because we keep the time lag between initial exposure to kanamycin and assaying genotype frequencies relatively short (overnight growth, see Methods). Our experiment is effectively a multi-genotype competition experiment, and we provide a snapshot across the genotype pool at a given time. 
      If we had measured after several days of culture, genotypes with greater relative fitness would have spread further through the population, at the cost of less fit genotypes, many of which would likely have been eliminated. We kept measurement lag relatively short on purpose so that we could see a clear differential response to kanamycin while still being able to catch more than just a handful of the very fittest genotypes.
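
      To make the role of replicates concrete, here is a minimal sketch of how a per-genotype fold-change and its uncertainty could be summarised across replicates. It is deliberately simpler than the DESeq2 analysis mentioned above and is not the pipeline used in the study: the function names, the pseudocount, and the use of a plain standard error are all illustrative assumptions.

```python
import math
from statistics import mean, stdev


def log2_fold_changes(plus_kan, minus_kan, plus_totals, minus_totals, pseudocount=0.5):
    """Per-replicate log2 ratio of a genotype's relative frequency (+kan over -kan).

    plus_kan / minus_kan: read counts for one genotype, one entry per replicate.
    plus_totals / minus_totals: total reads per replicate, used to normalise depth.
    pseudocount: assumed small constant that avoids taking the log of zero.
    """
    return [
        math.log2(((p + pseudocount) / pt) / ((m + pseudocount) / mt))
        for p, m, pt, mt in zip(plus_kan, minus_kan, plus_totals, minus_totals)
    ]


def summarise(lfcs):
    """Mean log2 fold-change and its standard error across replicates."""
    n = len(lfcs)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate variability")
    return mean(lfcs), stdev(lfcs) / math.sqrt(n)


# Hypothetical counts for one genotype in six replicates (not data from the study).
plus = [180, 210, 195, 205, 190, 200]
minus = [100, 98, 105, 102, 97, 101]
totals = [10_000] * 6

avg, se = summarise(log2_fold_changes(plus, minus, totals, totals))
print(f"mean log2 fold-change = {avg:.2f} +/- {se:.2f} (SE, n=6)")
```

      With a single replicate the standard error cannot be computed at all, which is the point made above about biological variability being invisible to single-replicate designs.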

      In light of the above, it will be apparent that there are no simple answers to the reviewer’s questions about required sequencing depth, levels of stringency, etc. The ability to assign differential fitness across a large population of genotypes hinges on multiple interrelated considerations (sequencing depth, complexity of the final & starting pool, number of replicates). In revising the manuscript, we will highlight some of the key considerations just discussed, bearing in mind that the manuscript cannot possibly discuss all possible pitfalls and requirements of deep mutational scanning experiments in great detail.

      Lastly, the evolvability of RNA is fascinating and there is much to learn. However, the authors don't discuss the implications of their findings for molecular evolution although they throw the term around. It would be exciting if there is a trend in the fitness landscape that could help explain the trajectory of RNA evolution in nature.

      We agree with the reviewer that it would be exciting to link deep mutational scanning results more closely with observable patterns of RNA evolution. This is true both in relation to evolution of P1ex/group I introns specifically and evolution of dynamic RNA structures more generally. Regarding the latter, we note that selection against excess stability has previously been inferred for 5’ UTRs (see e.g. Gu et al. 2010 PLoS Comp Biol 6: e1000664), although our case is slightly different in that a helix still needs to form but be sufficiently unstable to enable swift dissociation. We also note that riboswitches might make for an excellent subject to study asymmetric constraint and selection against excess stability as they involve formation of competing helices (including participation of some but not all nucleotides in more than one helix), their structure/function is well understood, and many examples are known, providing opportunities for evolutionary analysis. We consider this outside the scope of the current study. We will, however, seek to analyse patterns of evolution in P1ex to establish whether they correspond in a meaningful way to the fitness trends we observe in the laboratory. To do so, we will analyse the distribution and evolutionary history of variants across orthologous introns in different Tetrahymena species/strains, with a focus on P1ex, P10 and the surrounding sequence. Fortunately for us, the 23S ribosomal RNA gene in which the intron is embedded has been used as a phylogenetic marker so that intron/exon sequence information is available for a reasonable number of species/strains (see Doerder 2018 J Eukaryot Microbiol 66:182-208). We will generate an alignment of these sequences and ask, for example, whether N2-N5 are subject to different constraints than N18-N21 mirroring our experimental findings. 
      We have previously successfully quantified patterns of variation surrounding self-splicing introns in yeast mitochondria (Repar & Warnecke 2017 Genetics 205:1641-1648). Note here that extending this analysis beyond Tetrahymena is problematic. Specifically, the intron is absent from close relatives of Tetrahymena (Doerder 2018 J Eukaryot Microbiol 66:182-208) and P1-proximal structures of distant relatives are quite variable. In addition, we are looking at intronic regions that are not only adjacent to but also directly interact with exonic sequence. The exonic context in which the intron is embedded therefore matters but will be quite different for more distant group I introns. We therefore think that aligning and comparing distant orthologs has limited merit.

      The authors use the abbreviation DMS for deep mutational scanning; the RNA structure field uses the reagent dimethylsulfate that is also abbreviated DMS. They may want to choose a different acronym or just avoid an acronym altogether.

      We appreciate this point about false-friend acronyms. We will either find a different acronym or avoid it altogether.

      Reviewer #1 (Significance (Required)):

      As the importance of RNA structure for gene expression becomes more widely appreciated, interest in understanding the evolution of RNA structures is also increasing. Compared with the molecular evolution of proteins, evolution and fitness in RNA is far less understood, although the authors appropriately point to a number of recent studies on this topic. The main advance here is to use machine learning methods to analyze the results of a large genotypic screen, with the goal of more accurately capturing the fitness effects of sequences at varied distances from the parental sequence. The specific conclusions reached here such as the importance of metastability or the prominence of cold sensitive effects are not revolutionary, but the authors illustrate how such phenomena can be investigated more systematically and in more depth.

      We thank the reviewer for highlighting that our analytical approach showcases how deep mutational scanning data can be analysed in an unbiased and systematic manner to better understand the relationship between genotype, molecular phenotype (e.g. structure), and fitness. The reviewer also rightly points to specific results we obtain regarding temperature-related effects and metastability of P1ex/P10. However, we believe that the most important contribution of this work is a more general one, namely our proof-of-principle demonstration that deep mutational scanning data can capture multiple conformational states simultaneously, and that these states can be deconvoluted from a single fitness landscape to attribute the fitness impact of individual mutations to specific RNA conformations. To our knowledge this had not been explicitly demonstrated before and our work provides an important cornerstone for future studies looking to interpret mutational effects in either RNAs or proteins in the light of dynamic structures.

      In light of comments by reviewer #2 below, it is worth reiterating the proof-of-principle nature of this study. Many of the specific results we obtain (e.g. importance of avoiding excess stability in P1ex) are not revolutionary. Indeed, we would be worried if they were. We chose to investigate P1ex because substantial prior work exists that has furnished us with solid positive controls. This independent prior validation allows us to both have great confidence in the data we generate and demonstrate cogently that the two conformational states at the beginning and end of the splicing reaction are captured in the data.

      Finally, we believe our work, in covering a virtually complete genotype space, using multiple replicates to quantify uncertainty in fitness estimates, and using SHAP scores to interpret variant effects in genotype-specific context, sets a new high bar for this type of study and will provide valuable reference data and analytical recipes for future analyses.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Soo et al probes the effect of mutations on the fitness of the Tetrahymena Group I self-splicing intron. They used high-throughput sequencing to simultaneously identify the effect of every possible sequence in a 4-bp helix. The approach is sound and the conclusions are generally supported. However, the analysis seems overly complicated given the dataset. Both the analysis and the accompanying writing make it difficult to understand what seems to be a fairly clear conclusion - that the relative stabilities of two alternative RNA helices are important for splicing.

      We thank the reviewer for testifying to the validity of our approach and the soundness of our conclusions. Regarding the complexity of the analysis, the reviewer is right in that – for the conclusion that the relative stabilities of two alternative helices are important for fitness – a simpler analysis would have sufficed. However, as elaborated in response to point #11 above, our objective here is not merely to draw specific conclusions about the relative stabilities of P1ex and P10, but more general: a) to demonstrate that a single fitness landscape can be deconvoluted to implicate multiple conformations in fitness defects and b) to provide a basic but powerful recipe for doing so in an unbiased, systematic manner using machine learning.

      We will strive to make the writing clearer so that readers can follow this reasoning and appreciate our analytical choices.

      **Major Comments**

      The authors state that this method can identify the impact of transient conformational states. However, the two conformational states in this study are not transient - in fact they are associated with two distinct chemical steps of splicing and are quite stable. It may be that the effect of important transient states would be observed, but this study does not demonstrate that.

      We used the word “transient” to describe two alternative RNA structures formed during the life cycle of the intron. Both states (characterized by P1ex and P10 formation) are transient in as much as they disappear as splicing proceeds. In retrospect, we agree with the reviewer that this usage is too loose (given how the term is generally used in the literature) and might evoke the wrong connotations. We will therefore revise the manuscript to eliminate references to P1ex and P10 as transient states, but rather describe them as alternative conformations. Of course, the general point remains true: that deep mutational scanning data should in principle capture all fitness-relevant structural states even if these are transient (in the strict sense of the word).

      "Fitness" ends up being on an arbitrary scale, which impairs some analysis. A similar high-throughput sequencing pipeline could have been used to directly monitor splicing of every mutant, though at this point that is outside the scope of this study. Even with the arbitrary units, it would be clearer if more time were spent comparing fitness to base-pair stability on an individual basis, rather than the broad analyses. (See minor comments for details.)

      The reviewer is right in saying that a high-throughput pipeline could have been designed to monitor splicing of each genotype directly (rather than assaying fitness of the cell population that represents a particular genotype). We chose not to do so. One reason for this is that monitoring splicing directly would have necessitated design of a more complicated assay. This is because, to monitor splicing efficiency, one would have to monitor both pre-mRNA and mRNA for different genotypes. The former is straightforward (using primers that span the exon-intron junction) but the latter is not: successful splicing removes the genotype-specific information from the mRNA (that information being solely encoded in the intron). This is a solvable problem in principle. One might, for example, introduce barcodes of sufficient complexity in the mRNA that can be linked back to the intron genotype, but doing so would have introduced a further source of error and complicated analysis. We therefore opted for monitoring genotypic fitness by sequencing the plasmids from which the RNAs originate. This does mean that our measurements of fitness are not coupled to a specific molecular phenotype (such as splicing efficiency) – we presume (but are not entirely sure) this is what the reviewer refers to when talking about fitness being on an “arbitrary scale”. However, fitness derived in this manner has the advantage of providing information that does not start from a mechanistic preconception. We ask how a variant affects survival and reproduction of the cell without presuming a specific mechanism, and the results can therefore capture any mechanism, including those that we did not consider initially. The challenge then becomes to tease out possibly multiple mechanisms from unbiased data.

      We will tackle the reviewer’s final comment, regarding analysis of base-pair stability, below in response to one of the minor comments (point #20).

      **Minor Comments**

      The sentence in the abstract beginning "Using an in vivo report system..." is very difficult to comprehend. This is due both to the length of the sentence and the word usage. The final sentence of the abstract is similarly difficult. In general, the writing overemphasizes complexity at the cost of clarity.

      We will revise the entire manuscript to make the writing both clearer and more concise.

      Analysis of results in terms of "epistasis" obscures what could be a straightforward observation. This is the same as saying that mutants are not independent, or that their energetic costs are not additive. This follows obviously from the observation that the nucleotides being mutated are base-paired.

      Making explicit reference to “epistasis” is a considered choice. Framing results in terms of epistasis might be less familiar to readers grounded in RNA or protein biophysics/biochemistry, but is very much at the heart of thinking about the genotype-phenotype relationship from an evolutionary perspective, where global descriptions of epistasis are commonplace and usually provide the starting point for thinking about genotype-phenotype relationships, evolution and evolvability. So what seems unnecessarily obscure when seen through the lens of one field, is natural when considered in the context of another. Importantly, it is also the central approach adopted by many if not most prior deep mutational scanning studies (see e.g. Hayden et al. 2011; Pressman et al. 2019; Zhang et al. 2009; Li et al. 2016; Puchta et al. 2016; Domingo et al. 2018; Li and Zhang 2018; Weinreich et al. 2013; Lalić and Elena 2015; Bendixsen et al. 2017 as cited on page 3 of the manuscript) so we think this framing is helpful to compare our results to prior work.

      We expect that the readership will include many researchers interested in mapping genotype-phenotype-fitness relationships who will expect to see global analyses and descriptors of the type we present. We will, however, revise the manuscript to ensure that our description of the findings remains accessible to readers from other fields.

      More specifically, we also note that the fact that mutations are not independent (i.e. epistasis exists) might be trivial from the fact that P1ex is a base-paired helix. The magnitude and direction (“sign”) of epistasis, however, are not. In fact, as we describe, contrary to prior DMS on RNA helices, we find a lot of positive epistasis, reflecting, as we argue, selection against excess stability of P1ex to allow subsequent formation of P10.
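      To make "magnitude and sign" concrete, the following is a minimal sketch of one common way to score pairwise epistasis. It assumes fitness values are expressed relative to the master sequence (master = 1) and that single-mutant effects combine multiplicatively, i.e. additively on a log scale. The function name and the example fitness values are hypothetical, and this is not necessarily the definition used in the manuscript.

```python
import math

def pairwise_epistasis(f_a, f_b, f_ab):
    """Epistasis on a log scale for two mutations A and B.

    f_a, f_b : relative fitness of each single mutant (master sequence = 1)
    f_ab     : relative fitness of the double mutant
    Returns log(f_ab) minus the value expected if the two effects were independent.
    Positive values mean the double mutant is fitter than expected.
    """
    expected_log_fitness = math.log(f_a) + math.log(f_b)
    return math.log(f_ab) - expected_log_fitness

# Hypothetical values: each single mutant breaks a base pair, the double mutant restores it.
compensatory = pairwise_epistasis(0.5, 0.5, 1.0)

# Hypothetical values: the two effects combine exactly as independent effects would.
independent = pairwise_epistasis(0.5, 0.5, 0.25)

print(f"compensatory pair: {compensatory:+.3f}")
print(f"independent pair:  {independent:+.3f}")
```

      Under this convention the sign separates double mutants that do better than expected (positive) from those that do worse (negative), while the absolute value gives the magnitude.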

      The novel information is the sensitivity of fitness to base pairing. This is best shown in an analysis like Figure 3A (see below), not broad measures of epistasis.

      Please see responses to points #11, #12, and #16 above for an elaboration of what we consider to be the main merits of this study and why providing broad measures of epistasis is a sensible choice.

      Figure 1C isn't necessary for the reader to understand the process.

      We are happy to follow editorial guidance as to whether this panel is superfluous and should be removed or is worth including.

      It is unclear what figure 2C is showing. It appears that the replicates are similar to each other, that 30 deg C and 37 deg C are also similar, but that +/- Kan are different. This probably doesn't need a figure in the main text.

      This figure does indeed capture what the reviewer describes: genotype pools in +/-kan are least similar to each other, while 30/37 °C are similar but distinct in the +kan condition and effectively indistinguishable in the -kan condition, in line with expectations. We agree with the reviewer that this information per se is something that would typically be found in a supplementary figure. However, we would advocate for retention of this panel in the main manuscript in this instance because of the way in which it was derived: using the Bray-Curtis dissimilarity index. To our knowledge, this is the first time that Bray-Curtis dissimilarity has been used to quantify, in a principled way, the similarity between genotype pools. Borrowed from the ecology literature, the index captures both richness (number of different species/genotypes in the ecosystem/genotype pool) and relative abundance to provide an integrated measure of genotype diversity. We believe that this measure will be useful for future studies and rather than relegating the figure to the supplement, we would aim to briefly highlight its methodological novelty.
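      For readers unfamiliar with the index, a minimal sketch of the standard Bray-Curtis dissimilarity applied to two genotype pools is given below. It assumes each pool is a mapping from genotype to read count. The function name, the genotype strings and the counts are hypothetical, and the manuscript may normalise counts (for example to relative abundances) before computing the index.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two genotype pools.

    counts_a, counts_b : dicts mapping genotype -> read count.
    Returns 0.0 for identical pools and 1.0 for pools that share no genotype.
    """
    genotypes = set(counts_a) | set(counts_b)
    # Abundance shared by both pools: the smaller count for every genotype.
    shared = sum(min(counts_a.get(g, 0), counts_b.get(g, 0)) for g in genotypes)
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        raise ValueError("both pools are empty")
    return 1.0 - 2.0 * shared / total

# Hypothetical read counts for two pools of 4-nt genotypes.
pool_plus_kan = {"GCAU": 90, "GCGU": 10}
pool_minus_kan = {"GCAU": 40, "GCGU": 30, "AUAU": 30}

print(bray_curtis(pool_plus_kan, pool_minus_kan))
print(bray_curtis(pool_plus_kan, pool_plus_kan))
```

      Because the shared term uses the smaller count per genotype, the index reacts both to genotypes that are missing from one pool and to shifts in their abundance.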

      Figure 3A could be the most informative part of the manuscript. However, predicted minimum free energy should be on the x-axis as the independent variable. The expectation then is that you would see a peak in fitness at some free energy, with fitness falling off both with increased and decreased stability. Furthermore, there should be more analysis along these lines. The authors should calculate helical stability for both P1ex and P10 for every mutant and compare with fitness. Mutations which affect both could also be separated out. Figure 4C comes the closest to this but views it only in terms of GC pairs; there is no reason not to quantify the energetic effects given that predictions of stability for helices is quite good. Deviations from a model invoking only helical stabilities would indicate another factor is involved (alternative base-pairing or tertiary structure, for example).

      We agree with the reviewer that the axes in Figure 3A should be flipped and we will do so in the revised manuscript. We also agree that, when it comes to helical stability of P1ex, the simple expectation would be to see a peak at a certain stability with drop-offs either side, as intimated by Figure 4C. We further agree with the reviewer that Figure 4C is rather indirect and can be made more quantitative by considering helical stability across all genotypes directly. To this end, we will use one of the many tools available that allow prediction of helical stability from primary sequence (e.g. the efn2 function in RNAstructure, as used by Torgerson et al 2018 RNA, see point #24 below) and replace Figure 4C with a more quantitative fitness landscape based on these computations. To provide added confidence in the computations of helical stabilities from primary sequence in the context of our structure, we will also calculate helical stabilities from molecular dynamics simulations for the subset of genotypes we considered previously (Figure 4E/F) and see how inferred stabilities compare.

      There appears to be a missing verb in the legend for figure 3A, second sentence.

      We will fix this error.

      Figure S5 appears to be redundant with Figure 1.

      At first glance, Figure S5 does indeed appear redundant with Figure 1 but it is not. Figure S5 shows the relevant sequence of the group I intron and bordering exons in its native context, i.e. when embedded in the 23S ribosomal RNA gene of Tetrahymena thermophila, whereas Figure 1 shows the genotype of the mutant intron embedded in knt. The sequences are different. We will revise the legend to Figure S5 to make this clearer.

      Figure S6 is a better analysis than what appears in the main text, and could be expanded to all base pairs.

      We will expand Figure S6 to include all base pairs as suggested. We disagree that this is a better analysis compared to what appears in the main text. Rather, it provides a complementary, hypothesis-driven view whereas the analysis in the main text is more systematic and unbiased in approach.

      Reviewer #2 (Significance (Required)):

      This manuscript largely focuses on the technical approach. The shift in analytic strategy described above would increase the conceptual impact. The conclusions are consistent with and fit in with recent uses of high-throughput sequencing to study RNA systems. For example Pitt & Ferré-D'Amaré, Science (2010) and Kobari et al, NAR (2015) describe fitness landscapes of the ligase and HDV ribozymes, respectively. Torgerson et al RNA (2018) make similar measurements on the glycine riboswitch, including a treatment of relative helix stability for two mutually exclusive conformations. The overall results are of interest to researchers in the field of noncoding RNA.

      We thank the reviewer for highlighting the paper by Torgerson et al, of which – embarrassingly – we were not aware. We will make reference to this paper in a revised manuscript and highlight that riboswitches might be a good model system to further explore asymmetric constraint and selection against excess stability in an evolutionary context (also see our response to point #9 above).

      As highlighted earlier, we think the main conceptual impact of our work lies not in the description of helical stabilities. Rather, it lies in a) providing a rigorous proof-of-principle that deep mutational scanning can capture multiple conformational states simultaneously, and b) that, using an unbiased machine learning approach, these states can be deconvoluted from a single fitness landscape to attribute the fitness impact of individual mutations to specific RNA conformations. A shift in analytical strategy to “cut to the chase” and narrowly focus on helical stability would be misguided in this context, as we seek to provide not only insights into the data at hand but also lay out a sound and general recipe for analysing similar datasets in the future.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Soo et al probes the effect of mutations on the fitness of the Tetrahymena Group I self-splicing intron. They used high-throughput sequencing to simultaneously identify the effect of every possible sequence in a 4-bp helix. The approach is sound and the conclusions are generally supported. However, the analysis seems overly complicated given the dataset. Both the analysis and the accompanying writing make it difficult to understand what seems to be a fairly clear conclusion - that the relative stabilities of two alternative RNA helices are important for splicing.

      Major Comments

      1. The authors state that this method can identify the impact of transient conformational states. However, the two conformational states in this study are not transient - in fact they are associated with two distinct chemical steps of splicing and are quite stable. It may be that the effect of important transient states would be observed, but this study does not demonstrate that.

      2. "Fitness" ends up being on an arbitrary scale, which impairs some analysis. A similar high-throughput sequencing pipeline could have been used to directly monitor splicing of every mutant, though at this point that is outside the scope of this study. Even with the arbitrary units, it would be clearer if more time were spent comparing fitness to base-pair stability on an individual basis, rather than the broad analyses. (See minor comments for details.)

      Minor Comments

      1. The sentence in the abstract beginning "Using an in vivo report system..." is very difficult to comprehend. This is due both to the length of the sentence and the word usage. The final sentence of the abstract is similarly difficult. In general, the writing overemphasizes complexity at the cost of clarity.

      2. Analysis of results in terms of "epistasis" obscures what could be a straightforward observation. This is the same as saying that mutants are not independent, or that their energetic costs are not additive. This follows obviously from the observation that the nucleotides being mutated are base-paired. The novel information is the sensitivity of fitness to base pairing. This is best shown in an analysis like Figure 3A (see below), not broad measures of epistasis.

      3. Figure 1C isn't necessary for the reader to understand the process.

      4. It is unclear what figure 2C is showing. It appears that the replicates are similar to each other, that 30 deg C and 37 deg C are also similar, but that +/- Kan are different. This probably doesn't need a figure in the main text.

      5. Figure 3A could be the most informative part of the manuscript. However, predicted minimum free energy should be on the x-axis as the independent variable. The expectation then is that you would see a peak in fitness at some free energy, with fitness falling off both with increased and decreased stability. Furthermore, there should be more analysis along these lines. The authors should calculate helical stability for both P1ex and P10 for every mutant and compare with fitness. Mutations which affect both could also be separated out. Figure 4C comes the closest to this but views it only in terms of GC pairs; there is no reason not to quantify the energetic effects given that predictions of stability for helices is quite good. Deviations from a model invoking only helical stabilities would indicate another factor is involved (alternative base-pairing or tertiary structure, for example).

      6. There appears to be a missing verb in the legend for figure 3A, second sentence.

      7. Figure S5 appears to be redundant with Figure 1.

      8. Figure S6 is a better analysis than what appears in the main text, and could be expanded to all base pairs.

      Significance

      This manuscript largely focuses on the technical approach. The shift in analytic strategy described above would increase the conceptual impact. The conclusions are consistent with and fit in with recent uses of high-throughput sequencing to study RNA systems. For example Pitt & Ferré-D'Amaré, Science (2010) and Kobari et al, NAR (2015) describe fitness landscapes of the ligase and HDV ribozymes, respectively. Torgerson et al RNA (2018) make similar measurements on the glycine riboswitch, including a treatment of relative helix stability for two mutually exclusive conformations. The overall results are of interest to researchers in the field of noncoding RNA.

      Our expertise is in RNA biochemistry and biophysics. We are not qualified to evaluate the details of several of the computational pipelines described.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors constructed a virtually complete fitness landscape of the P1 extension region (4-base-paired helix) in the group I intron from Tetrahymena thermophila, using a kanamycin resistance reporter to evaluate the fold-change in fitness, which is related to self-splicing activity. This was a clever choice of system because it was known from earlier work that the P1 extension adopts two different conformations during self-splicing. The fitness of each variant was determined from the number of reads acquired from the sequencing data sets and analyzed through an extensive computational pipeline.

      The strength of the paper is that this machine learning approach can be used to calculate how individual variants contribute to the fitness landscape and assess the directions of epistasis across a large number of identified genotypes. The authors argue that machine learning more successfully models subtle effects that arise from interactions between RNA residues, and that the power to analyze deep mutational sequencing experiments can better rationalize fitness constraints arising from multiple conformational states. The results are mostly consistent with previous studies even though the authors collected the data in a more advanced and complicated way. They are also able to rationalize complex phenotypes - for example, the observed fitness defects are more prevalent under an unfavorable growth condition (30 °C), because the lower temperature hinders conformational exchange. Although such cold sensitive effects are well known in RNA, it is gratifying that this can be captured in the fitness landscape.

      Despite these strengths, there are several weaknesses that should ideally be addressed before publication.

      1. The results would be more convincing if the authors directly measure the self-splicing activity of a few key variants, such as the C2C21 mutant, to determine whether these mutations alter the self-splicing mechanism of the Tte-119(C20A) master sequence in the way that they infer from their model. In interpreting their results, they may want to consider misfolding of the intron core (coupled to base pairing of P1) and reverse self-splicing. Reversibility in the hairpin ribozyme, for example, turned out to be the key for understanding the effects of certain mutations.

      2. Related to the point above, interesting conclusions regarding the relationships between base identity and epistasis that arise from metastability should be strengthened with additional examples. For example, the authors can explain why a reverse base-pairing variant (C3G20) exhibits negative epistasis but is not similar to that of the G3C20 construct. This would ideally use the data from the screen but also be validated by checking the self-splicing activity of a few individuals at low and high temperature.

      3. They should validate the screen by showing that kanamycin resistance does indeed correlate strictly with self-splicing activity, and not some other feature such as RNA turnover. (It would also not be a bad idea to check this in the cell, which can be done by primer extension or Northern blotting.)

      4. The benefit of the machine learning model is that it can extract signals that may be hard to detect otherwise. The downside is that it doesn't produce a physical model, as far as I am aware. The parameters are themselves not meaningful - except to the degree that trends in the fitness estimates can be explained after the fact. This is something that should ideally be explained more directly in the manuscript.

      5. The authors claim that by evaluating a large number of sequences at two conditions, they can capture variants with intermediate phenotypes (Fig. 1). This is not necessarily true. If the original screen allows only the most active variants to survive on kan+ medium, then the signature of intermediate phenotypes may not be encoded in the original data, and thus not retrievable even with sophisticated algorithms, which may also be prone to overfitting. At what limit of stringency will the screen fail to yield information about intermediate fitness? How deeply must one sequence to recover this information, especially if noisy or degraded? Some discussion of these effects would be helpful.

      6. Lastly, the evolvability of RNA is fascinating and there is much to learn. However, the authors don't discuss the implications of their findings for molecular evolution although they throw the term around. It would be exciting if there is a trend in the fitness landscape that could help explain the trajectory of RNA evolution in nature.

      7. The authors use the abbreviation DMS for deep mutational scanning; the RNA structure field uses the reagent dimethylsulfate that is also abbreviated DMS. They may want to choose a different acronym or just avoid an acronym altogether.

      Significance

      As the importance of RNA structure for gene expression becomes more widely appreciated, interest in understanding the evolution of RNA structures is also increasing. Compared with the molecular evolution of proteins, evolution and fitness in RNA is far less understood, although the authors appropriately point to a number of recent studies on this topic. The main advance here is to use machine learning methods to analyze the results of a large genotypic screen, with the goal of more accurately capturing the fitness effects of sequences at varied distances from the parental sequence. The specific conclusions reached here such as the importance of metastability or the prominence of cold sensitive effects are not revolutionary, but the authors illustrate how such phenomena can be investigated more systematically and in more depth.

  2. Sep 2020
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Dear reviewers,

      Thank you very much for your constructive and helpful remarks and suggestions!

      We marked the changes in the manuscript in yellow.

      Our replies to the specific points:

      Reviewer #1 In the Introduction the authors need to cite earlier work in Chlamydomonas which first showed that binding of specific proteins to the psbA 5'UTR is correlated with increased translation in the light (Danon et al. 1991).

      As suggested, we added the reference to the introduction.

      Reviewer #1 The paper could be improved by testing for protein binding to the footprint region in high vs low light. An obvious candidate is HCF173.

      We agree that HCF173 is an obvious candidate, although its interaction could be mediated via additional proteins. Alice Barkan’s group has demonstrated that in maize HCF173 binds to the same region upstream of the translation initiation region (McDermott et al., 2019) where we detected a footprint (Supplemental Figure S11A-D). Furthermore, McDermott et al. showed that the binding sequence is conserved. We would like to analyze this question in more detail, but we currently have no approach available in the lab to specifically isolate psbA mRNA with its bound proteins for this analysis, and therefore have to postpone the answer to this question to future studies.

      Reviewer #2: Important changes to make before full submission: 1) It is becoming clear that the translation efficiency (TE) is often not a calculation of translational output from specific mRNAs but is in fact better described as ribosome association. There can be many reasons for increased ribosome association, including ribosome stalling and increased translational engagement. It would be good for the authors to add a simple Western blot to demonstrate directly increased protein output from psbA during high light as compared to low light treatments. This figure could be added to Figure S1.

      We want to stress that we have chosen a condition that is well known to increase psbA translation in higher plants as shown in the literature with different methods (e.g. Chotewutmontri and Barkan, 2018; Schuster et al., 2020). The protein encoded by psbA, the D1 subunit of photosystem II, has an increased turnover in high light, i.e. a higher amount of D1 has to be produced to compensate for the increased degradation of photodamaged D1 (Mulo et al., 2012; Li et al., 2018).

      Although there is a lot of evidence in the literature for a good correlation between translation efficiency as determined by ribosome profiling and protein synthesis, the reviewer raised a valid concern. Ribosome pausing or even ribosome stalling could also cause increased ribosome binding and thereby increased amounts of ribosome footprints. Therefore, we analyzed ribosome pausing in selected genes including psbA and rbcL. The pattern of ribosome pausing was very similar in low and high light (new Supplemental Figure S14), which rules out any ribosome stalling at specific sites or drastic changes in ribosome pausing. To analyze if there is increased ribosome pausing, we determined the fraction of footprints at pause sites compared to the total number of footprints. We used two different pause scores as cutoffs to determine pause sites. To include as many pausing events as possible, we used a pause score of 1, i.e. everything higher than the mean ribosome density per nucleotide of the corresponding coding region (Gawronski et al., 2018). This fraction was unaltered in low and high light (new Supplemental Figure S14). With a more stringent pause score of 20 (20 times higher ribosome density than the mean), an increase of ribosome pausing in high light was detected for psbA, whereas we did not find differences between high and low light for rbcL and psaA. However, this increase in pausing at the psbA mRNA is insufficient to explain the increase in the total amounts of ribosome footprints. Additional pause scores were tested; the value for the psbA fraction with a pause score of 20 included in Supplemental Figure S14 showed the largest difference.
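      The pause-site fraction described above can be sketched as follows. This is a minimal illustration that assumes per-nucleotide footprint counts over one coding region as input; the function name and the coverage values are hypothetical, and the actual analysis pipeline may differ in detail (for example in how footprints are assigned to positions).

```python
def pause_site_fraction(footprints_per_nt, cutoff):
    """Fraction of all footprints that fall on pause sites.

    footprints_per_nt : footprint counts per nucleotide of one coding region.
    cutoff            : pause-score cutoff; a position is a pause site when its
                        density exceeds cutoff times the mean density of the region.
    """
    total = sum(footprints_per_nt)
    if total == 0:
        return 0.0
    mean_density = total / len(footprints_per_nt)
    at_pause_sites = sum(c for c in footprints_per_nt if c / mean_density > cutoff)
    return at_pause_sites / total

# Hypothetical coverage: 90 low positions, 9 moderately elevated, 1 strong pause.
coverage = [5] * 90 + [20] * 9 + [370]

print(pause_site_fraction(coverage, cutoff=1))   # permissive cutoff
print(pause_site_fraction(coverage, cutoff=20))  # stringent cutoff
```

      Comparing this fraction between low and high light, rather than raw footprint counts, separates a change in pausing from a general increase in ribosome occupancy.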

      Reviewer #2: Strongly suggested additions to the manuscript to improve its significance before publication: 1) Identifying the RNA-binding protein(s) (likely HCF173, which may be in a complex with other proteins) that interact with the 5' UTR of psbA in a high light-dependent manner would increase the significance of this study. Finding that this protein binds to other plastid transcripts with weak Shine-Dalgarno sequences would also be a nice addition to this study.

      See comment to reviewer 1. McDermott et al. (2019) describe HCF173 as relatively specific for psbA. Therefore, we do not assume that other genes with weak Shine-Dalgarno sequences are regulated via HCF173 but via different proteins using a similar molecular mechanism to influence the mRNA secondary structure at the translation initiation region.

      Reviewer #2: Strongly suggested additions to the manuscript to improve its significance before publication: 2) Mutational analysis of the RBP binding site, and of the secondary structure around the start codon based on the new structure maps, to show the effects of these various changes on protein output would provide important new findings on how important RBP binding is, compared to the RNA secondary structure changes, for regulating protein output from psbA. It could also demonstrate the dependence or independence of these two features in regulating translation from chloroplast mRNAs.

      We agree with the reviewer that this would be a very interesting study. Unfortunately, it requires a larger collection of lines with mutated psbA sequences. Plastid transformation in Arabidopsis thaliana is still technically demanding and time consuming. Even in the case of Nicotiana tabacum, for which plastid transformation is well established, such a project would likely need several years. We therefore think that such a study is beyond the scope of the current manuscript.

      Reviewer #3 1. In this paper, the authors mention that DMS can modify four nucleotides under alkaline conditions. Because the chloroplast is slightly alkaline, the authors use DMS reactivity from all 4 nucleotides to model RNA secondary structure. Kevin Weeks's paper shows that in cell-free conditions, DMS has very limited ability to modify single-stranded G and U compared to A and C (Anthony M. Mustoe et al., 2019, PNAS 116: 24574. fig. 1B). In Lars B. Scharff's paper, which is cited by the authors, it is also mentioned that A and C are more reliable for modelling RNA secondary structure. The authors might need to calculate the correlation between the DMS data and known RNA structure using G/U or all four nucleotides to show that DMS reactivity from G and U is also reliable enough to be used. Also in Fig. S3B, the reproducibility of G/U between replicates is not as good as A/C. I don't think G and U can be used to predict RSS.

      We agree with the reviewer that DMS reactivities at G/U are less reliable than those at A/C. This was shown by Mustoe et al. (2019) and by us for chloroplast rRNAs (Gawronski et al., 2020, Plants). We included a correlation of the known 16S rRNA secondary structure and the DMS reactivities at the different nucleotides (Supplemental Figure S5A) that demonstrates that the DMS reactivities at G/U actually contain information about rRNA secondary structure. This analysis demonstrated again that the reactivities at G/U are less reliable than at A/C. Therefore, we added an analysis of the more reliable A/C for comparison with the results for all four nucleotides (Figure 1D-F, 3C-F).

      Reviewer #3 2. Is the 5'UTR the only region which has RSS change? If not, how do RSS changes in other regions contribute to translation?

      Translation initiation in plastids is mainly influenced by the secondary structure of the translation initiation region, especially at the cis-elements required for the recognition of the start codon. In addition, we have analyzed different other regions, e.g. the coding regions, the coding regions without the sequences next to the start codon, the end of the coding region, and the complete 5’ UTR (Supplemental Figure S14). We added a more detailed analysis of the changes of secondary structure of the coding region of those genes we focus on (Supplemental Figure S16). This shows that the secondary structure changes of the complete coding region correlate negatively with translation efficiency (see also Supplemental Figure S14G). A similar observation was made in E. coli and explained to be caused by differences in translation initiation, which are mainly influenced by the secondary structure of the translation initiation region (Mustoe et al., 2018).

      Reviewer #3 3. In Fig. 2A and 2B, the DMS reactivities seem very similar under low light and high light. Why did the authors obtain significantly different RNA secondary structures? Are the parameters for low light and high light the same when modelling RNA structure?

      The parameters for the RNA secondary structure predictions in Figure 2 are not identical (see Figure legend). For all structure predictions, the DMS reactivities were used as constraints, but only for the high light structure was the sequence of the RNA binding protein’s footprint forced to be single-stranded. These structure predictions are included to illustrate the mRNA structures in the presence and absence of an RNA binding protein. These structures are based on the observation that the two halves of the stem loop structure have different DMS reactivities in response to high light. The sequence including the protein footprint has lower DMS reactivities in both low and high light. This is in agreement with both a double-stranded sequence as well as a protein-bound sequence. In contrast, the other half of the stem loop, the sequence including the cis-elements of the translation initiation region, has increased DMS reactivities in high light, indicating that it is single-stranded. This suggests that there is protein binding in high light preventing the formation of the inhibitory stem loop.

      Reviewer #3 4. In Fig. S12, the correlation between HL and LL in ribo-seq and RNAseq is high, which means no significant changes upon light change. In this paper, psbA should have a translation change under high light conditions. I suggest the authors label the dot representing psbA.

      Thank you very much for this suggestion! We marked psbA in the correlation plots (Supplemental Figure 12). The changes in the transcript levels are really minor, whereas for some genes the translation efficiency changes (see Figure 4 and Supplemental Figure S13).

      Reviewer #3 5. I suggest using plants at the same stage for DMS-MaPseq and SHAPE probing.

      The different plant material was chosen because of the different requirements during probing. In this context, we would like to point out that observing the same changes in the translation initiation region in response to high light in different developmental stages is a stronger confirmation than observing the same response at the same developmental stage. This indicates that the response is not specific for a developmental stage.

      Reviewer #3 6. In Huang's paper (Jianyan Huang et al., 2019, Cell Reports 29: 4186-4199), there are many differentially expressed genes under high light for 0.5 hr. However, in the RNAseq data here, the correlation between high light and low light conditions is very high (Fig. S12). Why? Also, it would be nice if the authors could label in Fig. S12 several DEGs whose expression changes under high light treatment.

      Supplemental Figure S12 contains only plastid-encoded RNAs, whereas Huang et al. (2019) focused on nuclear-encoded mRNAs. We clarified the figure legend of Supplemental Figure S12 by adding “of the plastid-encoded genes”. The values for the individual genes can be seen in Supplemental Figure S13.

      Reviewer #3 7. For the MNase footprint method, is the as-SD region the only region showing enrichment under high light conditions? Besides, please provide the detailed method of MNase footprinting. Does it work for RNA footprinting?

      The used methods are described under “Ribosome profiling (Ribo-seq)” and “Processing of Ribo-seq and RNA-seq reads” in Material and Methods. The approach was very similar to the one used for ribosome profiling with the difference that also smaller read lengths were included in the analysis (18-40 nt instead of 28-40 nt). We did this, because many plastid RNA binding proteins have footprints that are smaller than a ribosomal footprint. The described footprint is the only one detected near the translation initiation region of psbA. Binding of HCF173 was detected by the Barkan group in the same region using a RIP-Seq Analysis combined with RNase I digestion (McDermott et al., 2019), which confirms that our approach is working. We added a reference to the method section in the results part to clarify which approach was chosen.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      RNA can fold into secondary and tertiary structures through base-pairing. RNA structure plays a crucial role in gene function and regulation, including transcription, processing, translation and decay. Plants acclimate to fluctuating light conditions to optimize photosynthesis and minimize photodamage. Translational regulation is known to be one strategy for this acclimation. It has been reported that translation of psbA, encoding the D1 reaction center protein of Photosystem II, is increased under high light conditions. The light-controlled psbA translation has been intensively studied and was suggested to be related to redox/thiol signals, the ATP status, and certain proteins. In this manuscript, Gawroński et al. explored the possible link between RNA secondary structure and translational efficiency. They adopted DMS-MaPseq and SHAPE-seq methods to profile the RNA secondary structure in the 5'UTR of psbA under low light and high light conditions. The results showed that the DMS and SHAPE reactivities of the Shine-Dalgarno (SD) sequence, start codon and as-SD region are higher under high light than under the low light control, indicating that the psbA translation initiation region becomes more single-stranded and accessible under high light. MNase digestion and DMS activity analysis suggested that protein binding might cause the change in RNA secondary structure of the psbA translation initiation region. In addition, the authors probed the RNA secondary structure of the translation initiation region of rbcL, which encodes the large subunit of Rubisco, and found no change in RNA structure, although the translation of rbcL is also increased under high light. To address why RNA structure changes are related to high light-dependent translational activation of psbA but not rbcL, plastome-wide translational efficiency and RNA structure were analyzed. The results showed a significant correlation between RNA secondary structure changes and translational efficiency changes in the chloroplast-encoded mRNAs with weak SDs (such as psbA), but not in those with strong SDs (such as rbcL).

      The light-dependent translational activation of psbA is critical for maintaining photosynthetic homeostasis. Also, the molecular mechanism by which RNA secondary structure (RSS) affects translation is still elusive. The topic of this study is therefore very important. However, this study only describes the phenomenon of RNA secondary structure changes in the translation initiation region and does not give further evidence to validate the effect of these changes on the translational activation of psbA under high light. Besides, the evidence that protein binding causes the RNA structure changes is weak and unclear. In addition, there is much room for improvement in this work.

      1. In this paper, the authors mention that DMS can modify all four nucleotides under alkaline conditions. Because the chloroplast is slightly alkaline, the authors use DMS reactivity from all four nucleotides to model RNA secondary structure. However, Kevin Weeks's paper shows that under cell-free conditions DMS has very limited ability to modify single-stranded G and U compared to A and C (Anthony M. Mustoe et al., 2019, PNAS 116: 24574, Fig. 1B). Lars B. Scharff's paper, which is cited by the authors, also mentions that A and C are more reliable for modelling RNA secondary structure. The authors might need to calculate the correlation between the DMS data and known RNA structures using G/U or all four nucleotides to show that DMS reactivity from G and U is also reliable. Also, in Fig. S3B, the reproducibility of G/U between replicates is not as good as that of A/C. I don't think G and U can be used to predict RSS.

      2. Is the 5'UTR the only region that has RSS changes? If not, how do RSS changes in other regions contribute to translation?

      3. In Fig. 2A and 2B, the DMS reactivities seem very similar under low light and high light. Why did the authors obtain significantly different RNA secondary structures? Are the parameters for low light and high light the same when modelling RNA structure?

      4. In Fig. S12, the correlation between HL and LL in Ribo-seq and RNA-seq is high, which means there are no significant changes upon light change. In this paper, psbA should show a change in translation under high light conditions. I suggest that the authors label the dot representing psbA.

      5. I suggest using plants at the same stage for DMS-MaPseq and SHAPE probing.

      6. In Huang's paper (Jianyan Huang et al., 2019, Cell Reports 29: 4186-4199), there are many differentially expressed genes under high light for 0.5 hr. However, in the RNA-seq data here, the correlation between high light and low light conditions is very high (Fig. S12). Why? Also, it would be nice if the authors could label several DEGs whose expression changes under high light treatment in Fig. S12.

      7. For the MNase footprint method, is the as-SD region the only region showing enrichment under high light conditions? Besides, please provide the detailed method of the MNase footprint. Does it work for RNA footprinting?

      Significance

      see above

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study uses multiple high-throughput sequencing approaches to probe the secondary structure of the chloroplastic psbA mRNA during low and high light treatments. The authors are able to demonstrate a shift in secondary structure around the start codon of this mRNA in response to the high light treatment as compared to low light conditions. This structural shift is also accompanied by an RBP binding event that may also be involved in regulating translation from this mRNA in response to high light. I think this study is very interesting and timely. However, I think determining the relative contributions of the secondary structure and RBP binding changes to potential increases in protein output from this mRNA in response to high light would improve this manuscript. I also think directly looking at protein levels through a straightforward Western blot to show increased psbA protein in response to high light treatment is an important addition to this study. I outline my few suggested experimental additions for this manuscript below.

      Important changes to make before full submission:

      1)It is becoming clear that the translation efficiency (TE) is often not a calculation of translational output from specific mRNAs but in fact is better to be described as ribosome association. There can be many reasons for increased ribosome association including ribosome stalling and increased translational engagement. It would be good for the authors to add a simple Western blot to demonstrate directly increased protein output from psbA during high light as compared to low light treatments. This figure could be added to Figure S1.
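
      To make the point about TE concrete: in Ribo-seq studies, TE is commonly computed as normalized ribosome footprint abundance divided by normalized mRNA abundance, so it measures ribosome loading per transcript and not protein output. The sketch below illustrates that ratio and its change between high and low light. The function names and all numbers are hypothetical, and it is an assumption that the manuscript's TE values were derived in this general way.

```python
import math


def translation_efficiency(ribo_abundance, rna_abundance):
    """Ribosome footprint abundance per unit of mRNA abundance (both normalized)."""
    return ribo_abundance / rna_abundance


def log2_te_change(ribo_hl, rna_hl, ribo_ll, rna_ll):
    """log2 fold change in TE between high light (HL) and low light (LL)."""
    te_hl = translation_efficiency(ribo_hl, rna_hl)
    te_ll = translation_efficiency(ribo_ll, rna_ll)
    return math.log2(te_hl / te_ll)


# Hypothetical normalized values for one transcript (not taken from the manuscript).
change = log2_te_change(ribo_hl=400.0, rna_hl=100.0, ribo_ll=200.0, rna_ll=100.0)
print(f"log2 TE change (HL vs LL): {change:.2f}")
```

      A positive value only shows that more footprints map per transcript; stalled ribosomes would raise it as well, which is why an independent protein-level measurement is requested.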

      Strongly suggested additions to the manuscript to improve its significance before publication

      1)Identifying the RNA-binding protein(s) (likely HCF173, which may be in a complex with other proteins) that interact with the 5' UTR of psbA in a high-light-dependent manner would increase the significance of this study. Finding that this protein binds to other plastid transcripts with weak Shine-Dalgarno sequences would also be a nice addition to this study.

      2)Mutational analysis of the RBP binding site, and mutations that change the secondary structure around the start codon based on the new structure maps, would show the effects of these changes on protein output. This would provide important new findings on how important RBP binding is compared to the RNA secondary structure changes for regulating protein output from psbA. It could also demonstrate whether these two features act dependently or independently in regulating translation from chloroplast mRNAs.

      Significance

      This study definitely focuses on a research topic that is currently of interest and highly timely.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript addresses the regulation of chloroplast translation, an important topic in chloroplast biology. The authors show that specific changes in the secondary structure of the 5'UTR of the psbA mRNA involving the Shine-Dalgarno sequence and the AUG initiation codon can be correlated with changes in translational efficiency during a low light to high light shift. Based on indirect evidence they propose that this may be caused by binding of specific proteins to this region. They also show that this correlation appears to be valid to some extent for other mRNAs with a weak SD sequence. The technical quality of this manuscript is excellent and the manuscript is clearly written.

      Additional remarks

      In the Introduction the authors need to cite earlier work in Chlamydomonas which first showed that binding of specific proteins to the psbA 5'UTR is correlated with increased translation in the light (Danon et al. 1991). The paper could be improved by testing for protein binding to the footprint region in high vs low light. An obvious candidate is HCF173.

      Significance

      This work provides valuable new insights into the molecular mechanisms involving the psbA 5'UTR in the initiation of chloroplast translation.

      This work will be of interest to a wide audience interested in the mechanisms of translational regulation.

      My expertise is in chloroplast biogenesis and in assembly and regulation of the photosynthetic apparatus

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Molenaars et al., describe a protocol to extract and quantify a wide range of polar and apolar metabolites from the same C. elegans sample using methanol-chloroform based phase separation. The authors assess the method across different input amounts, in comparison to a 1-phase extraction method and through metabolic perturbations using RNAi against several metabolic enzymes. Finally, they provide a metabolomics analysis of metabolite variation across several C. elegans strains. The data are of overall high quality and presented in a clearly written manuscript.

      We really appreciate the positive words from the reviewer.

      To help assessing the value of the method to other approaches, several controls are suggested below:

      1.Fig.1: Metabolite abundance in the polar phase should be compared to 1-phase extraction methods (analogous to Fig. 2I, which compares metabolites in the apolar phase to 1-phase extraction)

      We acknowledge the apparent asymmetry in the text; comparing our two-phase method to a single phase lipidomics method indeed suggests a similar comparison for metabolomics. However, our established polar metabolomics method has always been based on this exact two-phase extraction. The current method exclusively asks whether it is possible to integrate our dedicated lipidomics platform into our established two-phase polar metabolomics method, by utilizing the apolar phase that is usually discarded. This way, the method enables comprehensive metabolomics/lipidomics screening while limiting the need of culturing twice the amount of material.

      Our manuscript does not necessarily ask the more fundamental question of the advantages of a one-phase vs two-phase extraction for polar metabolites. Interestingly, the one-phase vs two-phase metabolomics methods have been compared previously and the authors show here that the two-phase method achieved broader metabolite coverage, satisfactory extraction reproducibility, acceptable recovery and safety (DOI: 10.1038/srep38885). This is most probably due to the cHILIC column being sensitive for contamination and therefore excluding lipids from your samples is beneficial for measuring polar metabolites. We hence believe that developing a single phase polar method would appear superfluous for the purpose of this study.

      2.Are polar metabolites also detected in the apolar phase? Can the less hydrophobic lipids missing from the apolar phase be detected in the polar phase?

      This is an interesting question that mostly relates to the lyso-lipids that are not detected in the lipid phase of our two-phase extraction. The first point to make is that sample solvents that are used at the final stage of extraction are not compatible between methods. In other words, the solvent we normally use for the lipids phase (xxx) cannot be injected on the cHILIC column. So, in a practical sense, we would not be able to measure these compounds, even if they would technically be dissolved in the other layer. However, we tried a few different alternative approaches to get more information on this point:

      We have attempted to integrate the lyso-lipids in the cHILIC measurements, in the polar layer, using the polar sample solvents. This was unsuccessful; no reproducible peaks, not even the internal standards, were measured. We will include a note on these results in our manuscript.

      We have, albeit for a different sample matrix, attempted to dissolve both layers of the two-phase extraction in the cHILIC sample solvents. While we cannot guarantee this for all metabolites, it appears that most polar metabolites are exclusively found in the polar layer. We were not able to integrate even a single peak from any of the sugars, amino acids, nucleotides, etc. in the apolar layer dissolved in polar solvents.

      We have reconstituted both the polar and apolar layer of our two-phase extraction in 50:50 methanol:chloroform and analyzed them on the lipidomics platform. We did find that some of the lipid internal standards partition to the polar phase, especially LPG (and to a lesser extent LPE and LPA), compared to, for instance, PE, SM, PG and PC, which all end up in the apolar phase. We will include these data in the revised manuscript as a supplemental figure, as they demonstrate that the lyso-lipids are poorly measured in the two-phase extraction. This is also why in the text we advise using the dedicated one-phase extraction when interested primarily in these species.

      3.Fig.3l-n: The authors claim that extracting metabolites from the polar and apolar phases of the same sample leads to better cross-correlation than if metabolites are extracted from different samples using methods optimized for the respective metabolite classes. To provide experimental evidence, metabolite abundance should be compared directly when metabolites are extracted from the same or from different samples using suitable methods.

      We agree with this point. We will amend the text to not overstate these advantages.

      Reviewer #1 (Significance (Required)):

      The methodological and conceptual advancement of the present study is rather incremental. The authors essentially use the classical chloroform/methanol/water phase separation protocols developed by Bligh & Dyer and Folch, which have been used extensively for lipid extraction for many decades now. However, the effort to carefully measure the metabolites contained in the aqueous phase is laudable. For method validation, the authors use well-understood perturbations that yield predictable results. Overall, I consider the study more appropriate for a publication as a methods protocol, which could be of interest to the metabolomics community, rather than as a research paper.

      We agree; our goal was indeed to create and share a method, we will make sure to emphasize this in our cover letter.

      While the extraction method we use is not novel per se and based on classical extraction procedures, it is important to underscore that we are only now able to use these extractions in combination with high-resolution mass spectrometry. This opens new opportunities for basic discovery. The efficiency we achieve by using both phases of the two-phase procedure makes our method highly attractive for hypothesis generation, especially in sample sets where limited amounts of material are available.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors provide a detailed description of a method to analyse both polar as well as lipophilic metabolites from the same nematode sample. This provides significant advantages over methods using individual samples. Moreover and by using internal standards they establish an extremely good correlation of individual metabolites. This paper is of immediate importance for the worms community and beyond.

      We are very grateful to receive this positive response from the reviewer and for highlighting the advantages of our described method also beyond the worm community.

      **Major comments:**

      none

      **Minor comments:**

      The correction process using internal standards could be described in a bit more detail.

      In our revised manuscript, we will describe the internal standard use and corrections in more detail in the text. In summary: internal standards are selected for specific metabolites based on their Pearson correlation and %CV. Subsequently, metabolite peak areas were divided by the area of the appropriate internal standard. This corrects for any loss of sample during sample prep, for instance during the isolation of the two layers.
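
      The correction step described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name and the peak areas are hypothetical, and the selection of the appropriate internal standard (by Pearson correlation and %CV) is assumed to have been done already.

```python
def normalize_to_internal_standard(metabolite_areas, standard_areas):
    """Divide each sample's metabolite peak area by that sample's internal standard area."""
    if len(metabolite_areas) != len(standard_areas):
        raise ValueError("need one internal standard area per sample")
    return [m / s for m, s in zip(metabolite_areas, standard_areas)]


# Hypothetical peak areas for one metabolite and its internal standard in four samples.
# Sample 3 lost roughly half of its material while the two layers were isolated.
metabolite = [1000.0, 1100.0, 520.0, 980.0]
standard = [500.0, 540.0, 250.0, 490.0]

ratios = normalize_to_internal_standard(metabolite, standard)
for raw, ratio in zip(metabolite, ratios):
    print(f"raw area {raw:7.1f} -> corrected ratio {ratio:.3f}")
```

      Because the internal standard is lost in the same proportion as the metabolite, the corrected ratios stay close together even though the raw area of the third sample is about half that of the others.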

      Jenni Watts has written a nice Worm Book chapter on lipids which may be cited in addition to reference 17, since it covers many of the metabolites and related enzymes contained in this manuscript

      We will include a reference to this Worm book chapter reviewing fat regulation in C. elegans in our paper, thank you for the suggestion.

      Reviewer #2 (Significance (Required)):

      see above

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript is well written and considered. However, there is room for further improvement:

      We thank the reviewer for the positive response and for the suggestions raised.

      1) The authors need to state exactly how many metabolites are measured, not just ">" (semi-quantitative analysis of >100 polar (metabolomics) and >1000 apolar (lipidomics) metabolites in C. elegans), as they did, for example, for the other papers in Table 1.

      We understand that this might appear vague. The notation was a compromise, based on the following considerations:

      1. The maximum number of reported metabolites can differ from the number of metabolites analyzed in a specific experiment or even a specific sample. For instance, our method is perfectly capable of measuring creatine metabolism (we have standards for these metabolites and they can be reliably measured); however, we have not yet been able to detect these metabolites in C. elegans. Some mutants also lose abundance of a certain metabolite to the point of it not being reliably measurable, which means it is filtered out in the bioinformatics.
      2. Since the initial draft of our manuscript we have been able, and will continue to be able, to add new metabolites to our analysis, as we perform a full scan over the range of m/z 50-1200. Because of this, we felt it more accurate to state that we can measure >100 metabolites, instead of a specific number.

      2) The authors also need to clarify the number of samples in the results section when describing the statistical analysis.

      We understand this point raised by the reviewer and will specify not only the number of samples, but also that they are indeed biological replicates. This will be included in the figure legends.

      Reviewer #3 (Significance (Required)):

      This might be an interesting paper for the research community that works with C. elegans (on metabolism or in general).

      Thank you. We are in fact utilizing this double extraction for other non-worm samples such as mouse and human tissues, and we believe this could also benefit the research community beyond the model organism C. elegans.

      The authors must deposit the raw data and make it available for the public, so they could also benefit from this good work.

      It is our full intention to share our data in a convenient and standardized way through for instance the MetaboLights database (https://www.ebi.ac.uk/metabolights/). We agree and changes will be implemented as suggested.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      **Summary:** The authors present a method for extraction of both lipid and polar metabolites from the model organism C. elegans. This extraction method is based on the well-established Bligh and Dyer method, with a slight modification to retain and utilize both the organic and non-polar fractions for LCMS analysis. They applied and tested this method against a monophasic extraction utilizing the same solvent system. They report that there is a loss of metabolites in the non-polar fraction to the polar fraction (of more polar metabolites) and small differences between the monophasic and biphasic extractions. They also expanded on the linearity of the extraction efficiency by increasing the number of worms. Further they applied the single extraction method to both knockdown mutants of C. elegans and Recombinant Inbred Lines derived from N2 and the natural isolate CB4856 to determine whether this method would still be able to differentiate the metabolome between the genetically different C. elegans populations.

      We thank the reviewer for their comments and suggestions.

      **Major comments:**

      *Are the key conclusions convincing?*

      As a whole the conclusions are convincing and valid.

      We appreciate that the reviewer considers our work convincing and valid.

      *Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?*

      The use of the adjective "robust" is, to an extent, erroneous. As defined, a robust method is one that is capable of withstanding small (deliberate or not) changes or variations. In this case the robustness of the method was not assessed, and it is not clear how replication was carried out.

      We have in fact performed analysis on both biological replicates and repeated injections of pooled samples to determine robustness. We will clarify the biological replicates in the text and will place the pooled QC samples in the main text with additional explanation and relevant statistics such as % coefficient of variance (%CV) between them. For clarity, we plotted %CV of all polar as well as apolar metabolites. For polar metabolites 97% of the metabolites had a %CV lower than 30. For apolar metabolites 86% of the metabolites had a %CV lower than 30.

      *Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.*

      Reproducibility would need to be assessed/quantified to establish how robust the method is. Even though linearity with an increase in the number of worms is a good indication, it does not satisfactorily establish the robustness of the method. The use of replicates to assess the agreement between measurements (i.e. Bland-Altman plots), linearity as well as coefficients of variation (included in the supplementary material but not clear in the body of the manuscript) would characterize the method best. The isolation of each variance originating from instrumental (pooled quality controls), biological (biological replication) and sample preparation (multiple extractions from the same biological source) sources is critical.

      We have these data and will elaborate on this in our revised manuscript. We will discuss the quality control samples more prominently in the main body of the manuscript, and show one or more figures that specifically address both analytical and biological variance (see rebuttal figure 2). In summary, we assessed this variance using (a) a repeated injection of a pooled QC sample, and (b) biological replicates prepared individually. Especially the latter condition, in which we assess biological variance is representative for the actual method application. The %CV under these conditions is ≤20% for the majority of metabolites, which is why we consider our method robust.
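
      As an illustration of how such a %CV summary can be computed, the sketch below derives the %CV per metabolite from repeated measurements and reports the fraction of metabolites below a chosen threshold. The metabolite names, abundances and helper names are hypothetical, and the sample standard deviation is assumed; the same calculation applies to repeated QC injections and to biological replicates.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent: sample standard deviation / mean * 100."""
    return 100.0 * stdev(values) / mean(values)


def fraction_below(cv_by_metabolite, threshold):
    """Fraction of metabolites whose %CV is below the threshold."""
    below = [name for name, cv in cv_by_metabolite.items() if cv < threshold]
    return len(below) / len(cv_by_metabolite)


# Hypothetical normalized abundances from four repeated injections of a pooled QC sample.
qc_injections = {
    "metabolite_A": [10.0, 10.5, 9.8, 10.2],
    "metabolite_B": [5.0, 7.5, 3.0, 6.5],
}

cv = {name: percent_cv(values) for name, values in qc_injections.items()}
for name, value in cv.items():
    print(f"{name}: %CV = {value:.1f}")
print(f"fraction with %CV < 30: {fraction_below(cv, 30.0):.2f}")
```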

      *Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.*

      The suggested experiments are in fact just further analysis of the already collected data. There would be no need for further experiments; however, it is not clear whether pooled QCs or reference materials were used, nor what the number of replicates per experimental design was.

      All the data are available. These analyses will be included in the revision.

      *Are the data and the methods presented in such a way that they can be reproduced?*

      The methods are very well described. My only comment is to address how the replicates were grown/created and how many per strain/group. If the replicate measurements were done on the same samples (repeated injections), I believe that would weaken the findings (if not invalidate them altogether), however if these were biological replicates from independent starting populations the findings are valid and convincing.

      We performed bona fide biological replicates. We will explicitly mention this in the paper together with the other descriptions of our validation protocols.

      *Are the experiments adequately replicated and statistical analysis adequate?*

      As per my above comments.

      **Minor comments:**

      *Specific experimental issues that are easily addressable.*

      It is not clear how the sample preparation process was carried out (randomization, run order, QCs etc.). See the widely accepted guidelines: Broadhurst, D., Goodacre, R., Reinke, S.N. et al. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics 14, 72 (2018). https://doi.org/10.1007/s11306-018-1367-3.

      We will provide details on the analysis itself in a table. In summary: Samples were measured in a random order, with blanks and QC samples throughout the run.
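
      A run order of this kind can be generated as in the sketch below. It is only an illustration: the QC interval, the fixed random seed and the labels are assumptions, since the rebuttal states only that samples were randomized and that blanks and QC samples were placed throughout the run.

```python
import random


def build_run_order(samples, qc_every=5, seed=0):
    """Shuffle the samples and interleave pooled QC injections, bracketed by blanks."""
    rng = random.Random(seed)  # fixed seed so the order can be reported and reproduced
    shuffled = list(samples)
    rng.shuffle(shuffled)

    run = ["blank", "pooled_QC"]
    for position, sample in enumerate(shuffled, start=1):
        run.append(sample)
        if position % qc_every == 0:
            run.append("pooled_QC")
    if run[-1] != "pooled_QC":
        run.append("pooled_QC")  # always close the sequence with a QC injection
    run.append("blank")
    return run


# Hypothetical sample labels.
samples = [f"sample_{i:02d}" for i in range(1, 13)]
for injection in build_run_order(samples):
    print(injection)
```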

      *Are prior studies referenced appropriately?*

      A major reference that has applied this extraction method before in the same model organism is missing:

      Castro, C., Sar, F., Shaw, W.R. et al. A metabolomic strategy defines the regulation of lipid content and global metabolism by Δ9 desaturases in Caenorhabditis elegans. BMC Genomics 13, 36 (2012). https://doi.org/10.1186/1471-2164-13-36

      We will include this paper in our references. We would like to note though that this method requires not just an LC system to analyze lipids, but also GC with additional derivatization steps. Our method achieves comprehensive lipidomics using a single technique and no additional derivatization.

      Further a recent publication that goes beyond the work described by the authors using similar approach: MPLEx: a Robust and Universal Protocol for Single-Sample Integrative Proteomic, Metabolomic, and Lipidomic Analyses. Ernesto S. Nakayasu, Carrie D. Nicora, Amy C. Sims, Kristin E. Burnum-Johnson, Young-Mo Kim, Jennifer E. Kyle, Melissa M. Matzke, Anil K. Shukla, Rosalie K. Chu, Athena A. Schepmoes, Jon M. Jacobs, Ralph S. Baric, Bobbie-Jo Webb-Robertson, Richard D. Smith, Thomas O. Metz mSystems May 2016, 1 (3) e00043-16; DOI: 10.1128/mSystems.00043-16

      We will also include this paper, reporting 51 polar metabolites and 84 lipid species, in our references. While we recognize that they also make use of both phases and the protein pellet, we think our method is much more practical in several key ways:

      Our metabolomics platform provides twice as many species and our lipids platform exceeds their analytical capabilities 10 fold. This means a far better coverage of differences within metabolite and lipid classes, allowing for far more intricate patterns to be detected. We show this for instance in our plots comparing carbon chain length to degree of saturation (Fig 4 and S2 in original manuscript); a comparison that is only possible with the data density that our method offers. The MPLEx metabolomics method also requires the use of a GC system and derivatization steps, while our method does not, making it much more user friendly and requiring only a single analytical system.

      *Are the text and figures clear and accurate?*

      Yes

      *Do you have suggestions that would help the authors improve the presentation of their data and conclusions?*

      The figures, overall are of exceptional quality.

      As per current scientific consensus, Box plots should also be overlaid with the actual datapoints (which was aptly done for the bar charts and other plots).

      The supplementary data even though comprehensive is hard to understand. A "readme" file detailing what data each file contains would improve readability and comply with FAIR principles.

      We agree that a readme file would make the supplemental data more understandable. We will provide such a file. For the box plots we will show the actual data points in our revised manuscript.

      Reviewer #4 (Significance (Required)):

      Even though the approach is not novel and has long been used in Natural Products Chemistry and in other organisms, it's highly significant to set an extraction method standard for the field of C. elegans metabolomics (including myself doing metabolomics and natural products chemistry with LCMS and NMR). However, this manuscript does not cover the technical aspects of the method with sufficient depth to hallmark this method as the standard for the field. Further information is needed to fill the missing gaps (as highlighted by the authors). Ratios between solvent and biological material amounts, reproducibility, recovery rates (even though buried in the supplementary files) and metabolite coverage are still missing.

      As a side note, the disparity between the monophasic and biphasic extractions could be overcome by a sequential extraction of the same sample, with no incurred cost on performance (and removing the much-dreaded pipetting uncertainty near the line between solvents). The second aspect of the manuscript, which initially was a welcoming idea (and important), became >50% of the manuscript creating a disconnect between the information set by the abstract and introduction and the results/conclusion. The work is extremely relevant in both sections of the manuscript, but the technical aspect is still lacking details and/or analysis.

      Strongly suggested: explicit compliance with the minimum reporting standards as per the Metabolomics Standards Initiative (MSI) and deposition of the data to a metabolomics repository (i.e. Metabolights or Metabolomics Workbench). These are internationally accepted requirements for metabolomics publications.

      We are aware that the extraction itself is an analytical chemistry staple. However, it is precisely in this fact that we find novelty. It should be noted that both of the other papers mentioned by the reviewers that have attempted to integrate lipidomics and metabolomics have had to resort to labor intensive (as well as possibly expensive and destructive) derivatization steps and a separate analysis on GC. Our method does not have these requirements. It is indeed a single and very common extraction, after which each dried phase is reconstituted and immediately injected. But this simplicity is not a concession, as our metabolome coverage is easily more comprehensive than the other mentioned methods. We therefore feel that this simplicity should not discount our currently presented method, but be considered an additional advantage.

      Sequential extractions may be an option to consider. However, we feel they are less user-friendly and unnecessary. Because we use internal standards, pipetting slightly more or less of any particular sample is never an issue, which makes it easy to avoid the line between solvents.
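The argument about internal standards can be sketched as follows. This is a minimal illustration, not the authors' correction procedure: it assumes the internal standard sits in the same phase as the analyte and that detector response is linear, so that pipetting a smaller aliquot scales both signals by the same factor. The function name and all peak areas are hypothetical.

```python
import math


def normalized_response(analyte_area, internal_standard_area):
    """Return the analyte peak area relative to the internal-standard peak area."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard area must be positive")
    return analyte_area / internal_standard_area


# Hypothetical peak areas for a full aliquot of one phase.
full_analyte, full_standard = 8.0e5, 2.0e5

# Pipetting 10% less scales both signals by the same factor
# (assumption: linear response, standard and analyte in the same phase).
fraction = 0.9
short_analyte = full_analyte * fraction
short_standard = full_standard * fraction

ratio_full = normalized_response(full_analyte, full_standard)
ratio_short = normalized_response(short_analyte, short_standard)

# The ratio is unchanged, so the exact pipetted volume drops out.
print(math.isclose(ratio_full, ratio_short))
```

Under these assumptions the normalized value does not depend on how much of the phase was collected, which is why staying clear of the interface costs nothing in quantification.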

      We will explicitly clarify where we already comply with the standards (such as the analysis of biological replicates and repeated injection of a QC sample) and are confident we can add figures and further information such as deposition of our data to comply with the rest.

      REFEREES CROSS-COMMENTING

      I completely agree with reviewer #1's comments; they are on point, and I completely missed these issues. They are relevant and should be addressed.

      Reviewer #2 points out work worth acknowledging: the internal standard work was quite thorough and well designed.

      Reviewer #3's comments and mine overlap nicely: both call for further description of samples/replication and deposition of data in a metabolomics repository.

      Further work is required to make this a good publication and a standard for the field. Without this extra work addressing the reviewers' comments, I feel this work could be to a certain degree misleading and/or incomplete, putting its publication potential in question.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary:

      The authors present a method for extraction of both lipid and polar metabolites from the model organism C. elegans. This extraction method is based on the well-established Bligh and Dyer method, with a slight modification to retain and utilize both the polar and non-polar fractions for LCMS analysis. They applied and tested this method against a monophasic extraction utilizing the same solvent system. They report that there is a loss of metabolites in the non-polar fraction to the polar fraction (of more polar metabolites) and small differences between the monophasic and biphasic extractions. They also expanded on the linearity of the extraction efficiency by increasing the number of worms. Further, they applied the single extraction method to both knockdown mutants of C. elegans and Recombinant Inbred Lines derived from N2 and the natural isolate CB4856 to determine whether this method would still be able to differentiate the metabolome between the genetically different C. elegans populations.

      Major comments:

      Are the key conclusions convincing?

      As a whole the conclusions are convincing and valid.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The use of the adjective "robust" is, to an extent, erroneous. As defined, a robust method is one capable of withstanding small (deliberate or not) changes or variations. In this case the robustness of the method was not assessed, and it is not clear how replication was carried out.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Reproducibility would need to be assessed/quantified to establish how robust the method is. Even though linearity with an increase in the number of worms is a good indication, it does not satisfactorily establish the robustness of the method. The use of replicates to assess the agreement between measurements (e.g. Bland-Altman plots), linearity, as well as coefficients of variation (included in the supplementary material but not clear in the body of the manuscript) would characterize the method best. Isolating the variance originating from each source, namely instrumental (pooled quality controls), biological (biological replication) and sample preparation (multiple extractions from the same biological source), is critical.
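The two agreement measures named above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the replicate intensities are hypothetical, the coefficient of variation uses the sample standard deviation, and the Bland-Altman limits of agreement use the conventional bias ± 1.96 standard deviations of the paired differences.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean of zero: coefficient of variation undefined")
    return 100.0 * statistics.stdev(values) / mean


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical normalized intensities of five metabolites measured in
# two independent extractions of the same biological source.
extraction_a = [10.2, 11.8, 9.9, 12.4, 10.7]
extraction_b = [10.0, 12.1, 10.3, 12.0, 10.9]

cv_a = coefficient_of_variation(extraction_a)
bias, lower, upper = bland_altman(extraction_a, extraction_b)
print(f"CV = {cv_a:.1f}%, bias = {bias:.2f}, limits = [{lower:.2f}, {upper:.2f}]")
```

Running the same calculation separately on repeated injections, repeated extractions and biological replicates would give one estimate per variance source, which is the separation requested above.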

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The suggested experiments are in fact just further analysis of the already collected data. There would be no need for further experiments; however, it is not clear whether pooled QCs or reference materials were used, nor how many replicates each experimental design included.

      Are the data and the methods presented in such a way that they can be reproduced?

      The methods are very well described. My only comment is to address how the replicates were grown/created and how many per strain/group. If the replicate measurements were done on the same samples (repeated injections), I believe that would weaken the findings (if not invalidate them altogether), however if these were biological replicates from independent starting populations the findings are valid and convincing.

      Are the experiments adequately replicated and statistical analysis adequate?

      As per my above comments.

      Minor comments:

      Specific experimental issues that are easily addressable.

      It is not clear how the sample preparation process was carried out (randomization, run order, QCs etc.). See the widely accepted guidelines:

      Broadhurst, D., Goodacre, R., Reinke, S.N. et al. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics 14, 72 (2018). https://doi.org/10.1007/s11306-018-1367-3.

      Are prior studies referenced appropriately?

      A major reference that has applied this extraction method before in the same model organism is missing:

      Castro, C., Sar, F., Shaw, W.R. et al. A metabolomic strategy defines the regulation of lipid content and global metabolism by Δ9 desaturases in Caenorhabditis elegans. BMC Genomics 13, 36 (2012). https://doi.org/10.1186/1471-2164-13-36

      Further, a recent publication goes beyond the work described by the authors using a similar approach:

      Nakayasu, E.S., Nicora, C.D., Sims, A.C. et al. MPLEx: a Robust and Universal Protocol for Single-Sample Integrative Proteomic, Metabolomic, and Lipidomic Analyses. mSystems 1(3), e00043-16 (2016). https://doi.org/10.1128/mSystems.00043-16

      Are the text and figures clear and accurate?

      Yes

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      The figures are, overall, of exceptional quality. As per current scientific consensus, box plots should also be overlaid with the actual data points (which was aptly done for the bar charts and other plots). The supplementary data, even though comprehensive, are hard to understand. A "readme" file detailing what data each file contains would improve readability and comply with FAIR principles.

      Significance

      Even though the approach is not novel and has long been used in Natural Products Chemistry and in other organisms, it's highly significant to set an extraction method standard for the field of C. elegans metabolomics (including myself doing metabolomics and natural products chemistry with LCMS and NMR). However, this manuscript does not cover the technical aspects of the method with sufficient depth to hallmark this method as the standard for the field. Further information is needed to fill the missing gaps (as highlighted by the authors). Ratios between solvent and biological material amounts, reproducibility, recovery rates (even though buried in the supplementary files) and metabolite coverage are still missing.

      As a side note, the disparity between the monophasic and biphasic extractions could be overcome by a sequential extraction of the same sample, with no incurred cost on performance (and removing the much-dreaded pipetting uncertainty near the line between solvents).

      The second aspect of the manuscript, which initially was a welcome (and important) idea, grew to more than 50% of the manuscript, creating a disconnect between the information set by the abstract and introduction and the results/conclusion. The work is extremely relevant in both sections of the manuscript, but the technical aspect is still lacking details and/or analysis.

      Strongly suggested: explicit compliance with the minimum reporting standards as per the Metabolomics Standards Initiative (MSI) and deposition of the data to a metabolomics repository (e.g. MetaboLights or Metabolomics Workbench). These are internationally accepted requirements for metabolomics publications.

      REFEREES CROSS-COMMENTING

      I completely agree with reviewer #1's comments; they are on point, and I completely missed these issues. They are relevant and should be addressed.

      Reviewer #2 points out work worth acknowledging: the internal standard work was quite thorough and well designed.

      Reviewer #3's comments and mine overlap nicely: both call for further description of samples/replication and deposition of data in a metabolomics repository.

      Further work is required to make this a good publication and a standard for the field. Without this extra work addressing the reviewers' comments, I feel this work could be to a certain degree misleading and/or incomplete, putting its publication potential in question.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript is well written and well considered. However, there is room for further improvement:

      1) The authors need to state exactly how many metabolites were measured, not just ">" ("semi-quantitative analysis of >100 polar (metabolomics) and >1000 apolar (lipidomics) metabolites in C. elegans"), as they did for the other papers in Table 1.

      2) The authors also need to clarify the number of samples in the results section when describing the statistical analysis.

      Significance

      This might be an interesting paper for the research community working with C. elegans (on metabolism or in general).

      The authors must deposit the raw data and make them available to the public, so that others can also benefit from this good work.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors provide a detailed description of a method to analyse both polar and lipophilic metabolites from the same nematode sample. This provides significant advantages over methods using individual samples. Moreover, by using internal standards they establish an extremely good correlation of individual metabolites. This paper is of immediate importance for the worm community and beyond.

      Major comments: none

      Minor comments:

      The correction process using internal standards could be described in a bit more detail.

      Jenni Watts has written a nice WormBook chapter on lipids, which may be cited in addition to reference 17, since it covers many of the metabolites and related enzymes contained in this manuscript.

      Significance

      see above

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Molenaars et al. describe a protocol to extract and quantify a wide range of polar and apolar metabolites from the same C. elegans sample using methanol-chloroform based phase separation. The authors assess the method across different input amounts, in comparison to a 1-phase extraction method and through metabolic perturbations using RNAi against several metabolic enzymes. Finally, they provide a metabolomics analysis of metabolite variation across several C. elegans strains. The data are of overall high quality and presented in a clearly written manuscript.

      To help assess the value of the method relative to other approaches, several controls are suggested below:

      1. Fig. 1: Metabolite abundance in the polar phase should be compared to 1-phase extraction methods (analogous to Fig. 2I, which compares metabolites in the apolar phase to 1-phase extraction).

      2. Are polar metabolites also detected in the apolar phase? Can the less hydrophobic lipids missing from the apolar phase be detected in the polar phase?

      3. Fig. 3l-n: The authors claim that extracting metabolites from the polar and apolar phases of the same sample leads to better cross-correlation than if metabolites are extracted from different samples using methods optimized for the respective metabolite classes. To provide experimental evidence, metabolite abundance should be compared directly when metabolites are extracted from the same or from different samples using suitable methods.

      Significance

      The methodological and conceptual advancement of the present study is rather incremental. The authors essentially use the classical chloroform/methanol/water phase separation protocols developed by Bligh & Dyer and Folch, which have been used extensively for lipid extraction for many decades now. However, the effort to carefully measure the metabolites contained in the aqueous phase is laudable. For method validation, the authors use well-understood perturbations that yield predictable results. Overall, I consider the study more appropriate for a publication as a methods protocol, which could be of interest to the metabolomics community, rather than as a research paper.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their feedback and encouragement. We have now fully revised the manuscript to address all comments. Our specific responses are provided below and we have highlighted changes in the text. The major additions are:

      • analysis of simulated time-courses with lower temporal resolution
      • analysis of ex vivo PER2::LUCIFERASE SCN recordings
      • analysis of simulated time-courses with Poisson distributions of noise
      • plotted summary statistics for several figures
      • mathematical formula and explanation in the Methods

      Overall, these revisions have strengthened our findings and improved the manuscript, particularly in demonstrating that the issues with the chi-square periodogram are not specific to sampling interval or data type.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Tackenberg & Hughey investigate the reliability of a popular period estimation algorithm, the chi-square periodogram. They find a bias in the estimation, and through careful investigation identify the cause. This is a well executed and well presented study.

      **Comments:**

      In Figs 2+3 the authors show that the discontinuity in the periodogram coincides with a change in the number of complete cycles, K. However, in Fig 2C there are several other positions where K abruptly changes, but little effect on the chi-squared statistic is observed. Can the authors offer an explanation as to why the magnitude of the discontinuities differs?

      We have taken a closer look at how each component of the chi-square statistic calculation changes at points where K decreases, and have found that discontinuities do always occur at these points. In addition to the obvious effect of the K * N term on the sudden decreases, we found that the sum of squares of the column means alone (the primary component of the numerator) also changes abruptly at each transition point of K. As a result, the discontinuity magnitude is likely roughly proportional to the amplitude of the chi-square statistic at that point.
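The role of K in the statistic can be made concrete with a short sketch. It assumes one common statement of the chi-square periodogram, Q_P = K N Σ(M_h − M)² / Σ(X_i − M)², where N is the number of samples, K = floor(N / P) is the number of complete cycles at trial period P (in samples), M_h are the column means of the folded series and M is the overall mean. The manuscript's implementation may differ in details, and the function name and the sine-wave input are illustrative only.

```python
import math


def chi_square_statistic(values, period):
    """Chi-square periodogram statistic for a trial period given in samples."""
    n = len(values)
    k = n // period  # number of complete cycles, K
    if period < 2 or k < 1:
        raise ValueError("period must be >= 2 and no longer than the series")
    mean = sum(values) / n
    total_ss = sum((x - mean) ** 2 for x in values)
    if total_ss == 0:
        raise ValueError("constant series: statistic undefined")
    # Fold the first K * period samples into K rows and average each column.
    column_means = [
        sum(values[h + row * period] for row in range(k)) / k
        for h in range(period)
    ]
    column_ss = sum((m - mean) ** 2 for m in column_means)
    return k * n * column_ss / total_ss


# Illustrative input: three full cycles of a sine wave, 24 samples per cycle.
series = [math.sin(2 * math.pi * i / 24) for i in range(72)]

for trial in (23, 24, 25):
    cycles = len(series) // trial
    print(trial, cycles, round(chi_square_statistic(series, trial), 2))
```

In this toy series K falls from 3 to 2 as the trial period moves from 24 to 25 samples, so both the K * N factor and the column means change at the same point, which is the kind of transition discussed in the response above.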

      An important claim is that the discontinuity is observed in multiple software implementations. However, the plots of Supplementary Fig 1C,D are presented too small to evaluate this claim.

      In Supplemental Fig. 1C-D, the critical information is the shape of the periodogram and the presence of a discontinuity, so we believe the plot sizes are appropriate.

      It may be of interest to apply the algorithms to a single-cell experimental data set which are qualitatively different (e.g., oscillation shape, damping).

      We have created a new supplemental figure (Supplemental Fig. 8) by applying the strategy and visualization used in Fig. 6 to SCN PER2::LUC recordings instead of wheel-running data, and have updated the text accordingly.

      Reviewer #1 (Significance (Required)):

      It has been previously shown that the chi-square periodogram algorithm has performance shortcomings for the analysis of circadian data (e.g. Zielinski et al., 2004). However, this study demonstrates exactly why, giving more conclusive evidence to support the conclusion that it should be avoided. This will be useful to many in the mammalian circadian community. It should be noted, however, that other algorithms are already favoured by other clock communities (e.g. plant), even if a rigorous understanding of the biases was lacking.

      The methods developed here will be valuable for future comparisons of circadian algorithms. Of particular importance will be comparing algorithms for analysis of single-cell rhythms or non-stationary rhythms.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Chi-squared periodograms (CSP) are routinely used in circadian biology. In particular, this test has been used to determine circadian period in behavioral data (e.g. actigraphy) in mammals, flies and other species. This paper suggests that CSP could, in some circumstances (e.g. where there are discontinuities), be improved by changing the algorithm. They propose different steps to do this (e.g. using their greedy CSP code) and/or by using alternative tests such as Lomb-Scargle.

      The authors use simulated data to demonstrate their findings, and whilst I can see the benefits of this, it would be useful to benchmark the algorithms on actual real-world circadian data (e.g. actograms from mouse or fly experiments). Although these types of data may not be publicly available, they are highly likely to be available from multiple labs in the circadian field. In particular, fly datasets will be abundant in many clock labs. This would aid the utility of the paper's findings for the field.

      Fig. 6 is entirely based on real-world circadian data (mouse wheel-running activity), as is the newly added Supplemental Fig. 8.

      Reviewer #2 (Significance (Required)):

      The paper is helpful for the circadian field when dealing with datasets that may contain discontinuities.

      It appears that the paper will be primarily useful for behavioral data, rather than, for example, transcriptomic time courses, since these tend to be much shorter and less sample intensive. Thus, it would be useful for circadian (and other) researchers analysing activity data in particular.

      My expertise is in circadian rhythms, both behavioural and molecular (e.g. sequencing) level analyses. Thus, I would be a possible end-user for the algorithms in this paper.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The authors identify a serious flaw in a popular method called Chi-squared periodogram (CSP) for period estimation in circadian rhythms. They systematically get to the source of the problem -- a discontinuity in the test statistic. This flaw leads to a bias in the period estimate. They present two modifications to the CSP, one of which they prefer. Nevertheless, they show that other more flexible methods such as Lomb-Scargle Periodogram work well without this discontinuity (bias) issue.

      **Major Comments:**

      1. One thing the authors do not include is time series lengths of non-integer days. Would it not be an interesting suggestion to choose a non-integer length time course, which is not a multiple of the periods of interest, and still continue using CSP as is? This is also rather counter-intuitive.

      Figs. 3A and 6 and newly added Supplemental Fig. 8 use non-integer (24-h) days.

      2. I suppose the authors use a sampling resolution of 6 min with wheel-running activity in mind. But it would be worth it, in the interest of completeness, to also consider a lower resolution. There is nothing in this study that ties it to the specific application, is there?

      Although a sampling resolution of 6 minutes is not specific to wheel-running activity, we have added an analysis identical to that of Fig. 5 but with a resolution of 20 minutes (Supplemental Fig. 5). Additionally, the PER2::LUC SCN recordings analyzed in Supplemental Fig. 8 have a sampling resolution of 20 minutes.

      3. The authors discuss only the mean absolute error in the text, but isn't the direction (sign) of the error also of interest? As far as I can see in Fig 5, conservative CSP overestimates and greedy CSP generally underestimates periods.

      We discuss both the error (references to Fig. 5A) and absolute error (references to Fig. 5B) in the text. We feel the interpretation suggested by the reviewer may be too reliant on the results of 3-day simulations, as the apparent underestimation by greedy CSP appears far less substantial in simulations of 6 and 12 days.

      **Minor Comments:**

      1. I would like to see the formulae for the ratio of variances and p-values to be clear about how the authors computed the CSP. They describe it in words already, but I think some mathematics is warranted here.

      We have added the formula for the standard chi-square periodogram to the Methods section.

      2. It is nice to see the raw data in the plots. But I would like to see a plot of the summary statistics (mean and variance/st. dev) for each of the scatter plots to judge the size of the bias. It is not easy to do this with the Excel sheet.

      We have overlaid a black circle representing the median and a vertical black line representing the 5th-95th percentile range onto Fig. 5 and Supplemental Figs. 3-7.
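The overlaid summary can be computed along these lines. This is a minimal sketch: the function name and the error values are hypothetical, and the percentile convention (Python's "inclusive" interpolation) is an assumption, since the response does not state which definition was used for the figures.

```python
import statistics


def summarize(errors):
    """Return (median, 5th percentile, 95th percentile) of the values."""
    # With n=20 the cut points fall at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(errors, n=20, method="inclusive")
    return statistics.median(errors), cuts[0], cuts[-1]


# Hypothetical period-estimation errors (hours) from a set of simulations.
errors = [-0.4, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3, 0.5, 1.2]

median, low, high = summarize(errors)
print(f"median = {median:.2f} h, 5th-95th percentile = [{low:.2f}, {high:.2f}] h")
```

The median and a percentile range are less sensitive to a few extreme estimates than the mean and standard deviation the reviewer asked for, which may matter when a method occasionally returns a badly wrong period.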

      Reviewer #3 (Significance (Required)):

      The authors present a sobering perspective on the chi-squared periodogram, which is still very popular among empirical biologists. They plainly show using artificial data that it is better to avoid the CSP when possible, although they suggest improvements to the CSP. The authors provide an R package to perform the analysis.

      There has been previous work that has highlighted other limitations of the CSP. This might be considered one more nail in the coffin of the CSP.

      I think this paper would be of interest to both computational biologists and wet-lab biologists, but I think it ought to have a greater influence on the latter, as the former already resort to more sophisticated approaches.

      My expertise is in Computational and Theoretical biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors identify a serious flaw in a popular method called Chi-squared periodogram (CSP) for period estimation in circadian rhythms. They systematically get to the source of the problem -- a discontinuity in the test statistic. This flaw leads to a bias in the period estimate. They present two modifications to the CSP, one of which they prefer. Nevertheless, they show that other more flexible methods such as Lomb-Scargle Periodogram work well without this discontinuity (bias) issue.

      Major Comments:

      1. One thing the authors do not include is time series lengths of non-integer days. Would it not be an interesting suggestion to choose a non-integer length time course, which is not a multiple of the periods of interest, and still continue using CSP as is? This is also rather counter-intuitive.

      2. I suppose the authors use a sampling resolution of 6 min with wheel-running activity in mind. But it would be worth it, in the interest of completeness, to also consider a lower resolution. There is nothing in this study that ties it to the specific application, is there?

      3. The authors discuss only the mean absolute error in the text, but isn't the direction (sign) of the error also of interest? As far as I can see in Fig 5, conservative CSP overestimates and greedy CSP generally underestimates periods.

      Minor Comments:

      1. I would like to see the formulae for the ratio of variances and p-values to be clear about how the authors computed the CSP. They describe it in words already, but I think some mathematics is warranted here.

      2. It is nice to see the raw data in the plots. But I would like to see a plot of the summary statistics (mean and variance/st. dev) for each of the scatter plots to judge the size of the bias. It is not easy to do this with the Excel sheet.

      Significance

      The authors present a sobering perspective on the chi-squared periodogram, which is still very popular among empirical biologists. They plainly show using artificial data that it is better to avoid the CSP when possible, although they suggest improvements to the CSP. The authors provide an R package to perform the analysis.

      There has been previous work that has highlighted other limitations of the CSP. This might be considered one more nail in the coffin of the CSP.

      I think this paper would be of interest to both computational biologists and wet-lab biologists, but I think it ought to have a greater influence on the latter, as the former already resort to more sophisticated approaches.

      My expertise is in Computational and Theoretical biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Chi-squared periodograms (CSP) are routinely used in circadian biology. In particular, this test has been used to determine circadian period in behavioral data (e.g. actigraphy) in mammals, flies and other species. This paper suggests that CSP could, in some circumstances (e.g. where there are discontinuities), be improved by changing the algorithm. They propose different steps to do this (e.g. using their greedy CSP code) and/or by using alternative tests such as Lomb-Scargle.

      The authors use simulated data to demonstrate their findings, and whilst I can see the benefits of this, it would be useful to benchmark the algorithms on actual real-world circadian data (e.g. actograms from mouse or fly experiments). Although these types of data may not be publicly available, they are highly likely to be available from multiple labs in the circadian field. In particular, fly datasets will be abundant in many clock labs. This would aid the utility of the paper's findings for the field.

      Significance

      The paper is helpful for the circadian field when dealing with datasets that may contain discontinuities.

      It appears that the paper will be primarily useful for behavioral data, rather than, for example, transcriptomic time courses, since these tend to be much shorter and less sample intensive. Thus, it would be useful for circadian (and other) researchers analysing activity data in particular.

      My expertise is in circadian rhythms, both behavioural and molecular (e.g. sequencing) level analyses. Thus, I would be a possible end-user for the algorithms in this paper.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Tackenberg & Hughey investigate the reliability of a popular period estimation algorithm, the chi-square periodogram. They find a bias in the estimation, and through careful investigation identify the cause. This is a well executed and well presented study.

      Comments:

      In Figs 2+3 the authors show that the discontinuity in the periodogram coincides with a change in the number of complete cycles, K. However, in Fig 2C there are several other positions where K abruptly changes, but little effect on the chi-squared statistic is observed. Can the authors offer an explanation as to why the magnitude of the discontinuities differs?

      An important claim is that the discontinuity is observed in multiple software implementations. However, the plots of Supplementary Fig 1C,D are presented too small to evaluate this claim.

      It may be of interest to apply the algorithms to a single-cell experimental data set which are qualitatively different (e.g., oscillation shape, damping).

      Significance

      It has been previously shown that the chi-square periodogram algorithm has performance shortcomings for the analysis of circadian data (e.g. Zielinski et al., 2004). However, this study demonstrates exactly why, giving more conclusive evidence to support the conclusion that it should be avoided. This will be useful to many in the mammalian circadian community. It should be noted, however, that other algorithms are already favoured by other clock communities (e.g. plant), even if a rigorous understanding of the biases was lacking.

      The methods developed here will be valuable for future comparisons of circadian algorithms. Of particular importance will be comparing algorithms for analysis of single-cell rhythms or non-stationary rhythms.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank the reviewers for their comments and suggestions. Our responses to them are listed below. We are hopeful that they will be satisfied with our responses and the changes we made in the revised version of the manuscript.

      REVIEWER #1


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this manuscript, Ameen and colleagues report the results of a multidimensional proteomic analysis which combined quantitative proteomics, phosphoproteomics and N-terminomics in an effort to identify neuronal proteins displaying altered abundance or modifications by proteolysis and/or phosphorylation following an excitotoxic insult. Excitotoxicity is known to initiate by over-activation of ionotropic glutamate receptors which allows an increase in intracellular Ca2+ , ultimately leading to activation of proteases. The analysis revealed that glutamate treatment for up to 240 min did not significantly affect the abundance of neuronal proteins but caused dramatic changes in the phosphorylation state of many neuronal proteins. Based upon the phosphopeptides and neo-N-peptides, which contain the neo-N-terminal amino acid residue generated through proteolytic cleavage of intact neuronal proteins during excitotoxicity, the authors identified the proteins that undergo phosphorylation, dephosphorylation and/or enhanced proteolytic processing in excitotoxic neurons. By combining different software packages, they found that these modified proteins form complex interactions that affect signaling pathways regulating survival, synaptogenesis, axonal guidance and mRNA processing. These data suggest that perturbations in the aforementioned pathways mediate excitotoxic neuronal death. Then, the authors showed by Western blot analysis that CRMP2, a crucial regulator of axonal guidance signaling, exhibited enhanced truncation and reduced phosphorylation at specific sites upon glutamate treatment. These events may contribute to injury to dendrites and synapses associated with excitotoxic neuronal death. Furthermore, the authors showed that calpains are responsible for the proteolytic processing and cathepsins for enhanced degradation of proteins during excitotoxicity. 
Blockage of calpain-mediated cleavage site of the tyrosine kinase Src during excitotoxicity confers neuroprotection in an in vivo model of neurotoxicity. In that regard, over twenty protein kinases are predicted to be activated in excitotoxic neurons. Collectively, this study contributes to the construction of an atlas of phosphorylation and proteolytic processing events that occur during excitotoxicity and as such they can be targeted for therapeutic purposes.

      **Comments** Comment: The identification of potential calpain cleavage sites in neuronal proteins modified during excitotoxicity is an interesting finding of the study. However, the atlas presented appears to miss components such as Kinase D-interacting substrate of 220 kDa (Kidins220), also known as ankyrin repeat-rich membrane spanning (ARMS), a protein recently shown to be cleaved by calpain during excitotoxicity (López-Menéndez et al, 2019, Cell Death and Disease 10, 535).

      Response: The calpain cleavage site of neuronal ARMS/Kidins220 was mapped to the peptide bond between Asn-1669 and Arg-1670 (Gamir-Morralla, et al. (2015) Cell Death & Disease 6, e1939). The cleavage is expected to generate two truncated fragments: one of ~185 kDa on the N-terminal side of the cleavage site and another of ~10 kDa on the C-terminal side. Our TAILS analysis failed to detect the 10 kDa fragment, which contains the neo-N-terminus generated by calpain cleavage. The possible explanations are as follows:

      The neo-N-terminus of the 10 kDa C-terminal fragment is unlikely to be observed in our experiment because the TAILS method relies on the production of peptides by trypsin. The 10 kDa fragment has arginine as its first amino acid, which means that the N-terminal peptide released and isolated by the TAILS method would be a single amino acid.

      In their publication, Gamir-Morralla, et al. showed that the total levels of both intact and degraded ARMS/Kidins220 decreased as a result of ischemic cerebral stroke, suggesting degradation rather than proteolytic processing to generate stable truncated fragments as the final outcome of calpain cleavage of ARMS/Kidins220 (Figure 2b of the publication by Gamir-Morralla, et al.). The TAILS method predominantly detects proteolytic processing, whereas degradation can be more difficult to capture. Degradation often results in peptides of fewer than 5-6 amino acids, which are difficult to align with a single protein, or in transient peptides that may not be detectable in neurons at 240 min after glutamate treatment. Overall, it is possible that the Kidins220 cleavage product is generated but was undetected by the TAILS approach.
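
      The first explanation (an arginine at the neo-N-terminus yields a single-residue tryptic peptide) can be illustrated with a minimal Python sketch. It assumes a simplified digestion model in which trypsin cuts only C-terminal to arginine, as is commonly the case when lysine amines are blocked by labelling; the function name `digest_after_arg` and the example sequence are hypothetical and are not taken from ARMS/Kidins220.

```python
def digest_after_arg(sequence: str) -> list[str]:
    """Split a sequence after every arginine (R).

    Simplified, assumed model: ignores the proline rule and missed cleavages.
    """
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue == "R":
            # Cleavage occurs C-terminal to arginine, closing the current peptide.
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


# Hypothetical fragment whose neo-N-terminal residue is arginine.
fragment = "RSLDEAGKLTRVNPEQ"
peptides = digest_after_arg(fragment)
n_terminal_peptide = peptides[0]
print(n_terminal_peptide, len(n_terminal_peptide))
```

      Under these assumptions the N-terminal peptide is the lone arginine, which is too short to be assigned to a protein.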


      Comments: The CRMP2 antibody (Cell Signalling, 35672) used for western blots (figure 5D, also figure S11) and immunofluorescence (figure 5E) is problematic. Copied from https://www.cellsignal.com/products/primary-antibodies/crmp-2-d8l6v-rabbit-mab/35672: Monoclonal antibody is produced by immunizing animals with a synthetic peptide corresponding to residues surrounding Ile546 of human CRMP-2 protein. The truncated CRMP2 (figure 5D) studied in the whole section (residues 1-516 or 1-517, ~57kDa) cannot be recognized by this monoclonal antibody. The detected band with the red letters in figure 5D might represent another cleavage product. In any case, asking Cell Signalling for more information about the exact immunogen might help, but since it's monoclonal and derived from residues surrounding Ile546 it's very hard to include residues before aa516 and the unique epitope recognition upstream of aa516. The whole result section and discussion has to be reconsidered. Alternatively another antibody can be used to repeat those experiments in order to support the hypothesis. Time and resources are very familiar to authors since they have to repeat their previous work with a new antibody. Finally, there are no "western blot" and "immunofluorescence" methods for CRMP2.

      Response: We would like to apologise for incorrectly listing the catalogue number of the anti-CRMP2 antibody purchased from Cell Signaling Technology. Rather than the rabbit monoclonal anti-CRMP2 antibody (Cell Signaling, Cat#: 35672), we used the polyclonal anti-CRMP2 antibody (Cell Signaling, Cat#9393) to perform all the Western blot and immunofluorescence analyses in this paper. The e-mail confirming the purchase of this antibody is appended. According to the vendor, the antibody was raised by immunizing rabbits with a synthetic peptide derived from the human CRMP2 sequence. We decided to order this antibody because Zhang, et al. (Sci Rep. 2016; 6: 37050) reported that it could detect the truncated CRMP2 fragments generated by calpain cleavage in primary cortical neurons in vitro in response to axonal damage.

      The Western blot and immunofluorescence procedures, with the correct CRMP2 antibody details, have been added to the revised version of the submitted manuscript.


      Comment: The truncated DCLK1 bands detected in figure S8B cannot be attributed to the proteolytic processing of DCLK1 at the sites described: T311↓S312, S312↓S313 and N315↓G316 (predicted M.W. of the (C-terminal) products: 48.7-49.1 kDa (figure S8A), which are too close to be well separated by conventional PAGE). The number and the separation of the bands suggest other cleavage sites.

      Response: We agree with the reviewer’s comment that conventional SDS-PAGE cannot differentiate the proteolytic products generated by cleavage at the three sites identified by TAILS. Furthermore, the TAILS method cannot detect all peptides generated from a protein during proteolysis. Therefore, validating our results with a Western blot experiment may reveal cleavage products that TAILS did not identify in certain cases. We have now added the following statement to the revised manuscript to reflect the presence of other cleavage sites: “Besides detecting the 50-56 kDa truncated fragments, the antibody also cross-reacted with several truncated fragments of ~37-45 kDa. These findings suggest that DCLK1 underwent proteolytic processing at multiple other sites in addition to the three cleavage sites identified by our TAILS analysis.”

      Comment: Could the striking observation that almost all proteolytic processing during excitotoxicity is catalyzed by calpains and/or cathepsins have derived (partially) from unspecific targets of calpeptin, such as a subset of tyrosine phosphatases (Schoenwaelder and Burridge, 1999: approx. 1 h treatment of fibroblasts with approx. 10x lower concentration) or others?

      Response: Schoenwaelder and Burridge (1999, JBC 274:14359) reported that calpeptin exhibits both protease-inhibitor and protease-inhibitor-independent activities in fibroblasts. They demonstrated that, besides inhibiting calpains and cathepsins, calpeptin could selectively inhibit a subset of membrane-bound tyrosine phosphatases. Since the TAILS method monitored the protease inhibitor activity of calpeptin, the proteolytic processing events mitigated by calpeptin in neurons during excitotoxicity are likely attributable to its protease inhibitor activity. Additionally, Schoenwaelder and Burridge reported this unconventional protease inhibitor-independent activity of calpeptin in fibroblasts. Since the protein tyrosine kinases expressed in neurons and fibroblasts are different, it is unclear if calpeptin can also exert such activity in neurons.

      Comment: Describing the final part of figure 4C the authors suggest that "Liver kinase B1 homolog (LKB1), CaM kinase kinase β (CaMKKβ) and transforming growth factor‐β‐activating kinase 1 (TAK1) are the known upstream kinases directly phosphorylating T172 of AMPKα to activate AMPK (Herrero-Martin et al., 2009; Woods et al., 2005; Woods et al., 2003). Our findings therefore predict activation of these kinases during excitotoxicity (Figure 4C)." The first question arising here is whether these three kinases are the only ones known to phosphorylate AMPKα. Even if this is true, it is highly speculative to suggest that the findings of the present study predict the activation of these kinases during excitotoxicity, without providing the necessary experimental data, since the increased phosphorylation of AMPK may be an indirect effect of the reduced function of a phosphatase. Thus the proposed model does not hold.

      Response: We agree and have revised our interpretation of the results to reflect this possibility. The revised sentence on page 13 reads “Liver kinase B1 homolog (LKB1), CaM kinase kinase β (CaMKKβ) and transforming growth factor‐β‐activating kinase 1 (TAK1) are the known upstream kinases directly phosphorylating T172 of AMPKα to activate AMPK (Herrero-Martin et al., 2009; Woods et al., 2005; Woods et al., 2003), while a member of the metal-dependent protein phosphatase (PPM) family could dephosphorylate T172 of AMPK in cells (Garcia-Haro et al., 2010). Our findings therefore predict activation of these kinases and/or inactivation of the PPM family phosphatase in neurons during excitotoxicity (Figure 4C).”

      Additionally, we also deleted the schematic diagram depicting the possibility of activation of LKB1, CaMKKβ and TAK1 in Figure 4 of the revised manuscript.

      **Minor points**

      Minor Comment: Highlights could present the key points of the study in a more straightforward manner. Response: Agree. We have edited the highlights in our revised manuscript to make them more straightforward.


      Minor comment: Figure 4A is too complicated. Proteins considered as hubs of signaling pathways in neurons should be somehow highlighted to distinguish them.

      Response: Agree. We have now highlighted the signalling hubs by shading them in green in the revised figure. As we merged figures 2 and 4 of the original manuscript, these signalling hubs are presented in Figure 2B of the revised manuscript.

      Minor Comment: The analysis of proteins with enhanced truncation and reduced phosphorylation such as CRMP2 and DCLK1 is fragmented. In addition, the authors should mention the criteria based on which these proteins were selected for further analysis.

      Response: IPA analysis revealed synaptogenesis and axonal guidance as the top-ranked perturbed canonical signalling pathways governed by neuronal proteins undergoing significantly increased proteolytic processing and altered phosphorylation. As CRMP2 and DCLK1 are the key players in these pathways, they were chosen for further biochemical analysis to validate the TAILS results. To address this point, we added a few statements in the sections describing results of biochemical analysis of CRMP2 and DCLK1 in the revised manuscript. The additional sentences on page 13 now read “IPA analysis of the significantly modified neuronal proteins identified in our study predicted perturbation of signalling pathways governing axonal guidance and synaptogenesis in neurons during excitotoxicity (Figure S7). Since CRMP2 (also referred as DPYSL2) is a key player in neuronal axonal guidance and synaptogenesis (Evsyukova et al., 2013) and it underwent significant changes in phosphorylation state and proteolytic processing (Figures 5A and S7), it was chosen for validation of our proteomic results.” The additional sentences on page 15 read ”Similar to CRMP2, DCLK1 is also a key player in regulation of axonal guidance and synaptogenesis (Evsyukova et al., 2013). Since our TAILS results revealed significant proteolytic processing of DCLK1 (Figure S8A), it was chosen for validation of our proteomic results.”


      Minor comment: The potential therapeutic relevance of phosphorylation and proteolytic processing events that occur during excitotoxicity can be further explored.

      Response: Thanks for the suggestion. We have added a paragraph describing additional evidence that protein kinase inhibitors, and cell-permeable inhibitors that block calpain cleavage of specific neuronal proteins, are potential neuroprotectants that could reduce brain damage induced by ischemic stroke. The additional sentences near the end of the Discussion section (page 25) now read “Since CRMP2 is key player in axonal guidance and synaptogenesis revealed by our proteomic analysis as the most perturbed cellular processes in excitotoxicity, blockade of its cleavage to form the truncated CRMP fragment is another potential neuroprotective strategy. Indeed, a cell-permeable Tat-CRMP2 peptide encompassing residues 491-508 close to the identified cleavage sites of CRMP2 could block calpain-mediated cleavage of neuronal CRMP2 and protect neurons against excitotoxic cell death (Yang et al., 2016).”


      The additional paragraph at the end of the Discussion section (page 25) now reads: “Besides the neuronal proteins undergoing enhanced proteolytic processing during excitotoxicity, protein kinases predicted by our phosphoproteomic results to be activated during excitotoxicity are also targets for the development of neuroprotective drugs. For example, our results demonstrated significant activation of neuronal AMPK during excitotoxicity, suggesting that aberrant activation of AMPK can contribute to neuronal death. Of relevance, small-molecule AMPK inhibitors could protect against neuronal death induced by ischemia in vitro, and brain damages induced by ischemic stroke in vivo. Likewise, inhibitors of Src and other Src-family kinases were known to protect against neuronal loss in vivo in a rat model of traumatic brain injury (Liu et al., 2008a; Liu et al., 2017). Future investigation of the role of the excitotoxicity-activated protein kinases in excitotoxic neuronal death will reveal if small-molecule inhibitors of these kinases are potential neuroprotective drug candidates.”


      Minor comment: I am sorry but I could not find Figure 8, which is supposed to show the "In vivo model of NMDA neurotoxicity" (please, see page 30).

      Response: Our apology for the mistake. This should be Figure 6 of the revised manuscript.

      Minor comment: Introduction: O'Collins et al., 2006; Savitz and Fisher, 2007; both references are missing.

      Response: This was an oversight on our part, and the references have been added to the revised manuscript.

      Minor comment: Figure S1A-B: vehicle treatment time course is needed.

      Response: All neurons were cultured in neurobasal media for seven days. The control neurons remained in culture medium while the other neurons were treated with glutamate for the MTT and LDH assays. The additional paragraph describing the design of the cell viability/death assays on page 32 reads “Primary cortical neurons were incubated for 480 min with and without the addition of 100 μM of glutamate. The control neurons were incubated for 480 min in culture medium. For neurons treated with glutamate for 30 min, 60 min, 120 min and 240 min, they were pre-incubated in culture medium for 450 min, 420 min, 360 min and 240 min, respectively prior to the addition of glutamate to induce excitotoxicity. For neurons treated with glutamate for 480 min, they were treated with glutamate just after seven days of culture in neurobasal media.”
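
      The timing scheme quoted above amounts to a fixed 480 min window in which pre-incubation plus glutamate exposure always sum to the same total. The following Python sketch restates that arithmetic; the constant and function names are hypothetical and serve only as an illustration of the design, not as part of the authors' protocol.

```python
TOTAL_INCUBATION_MIN = 480  # every culture spends 480 min in this step


def pre_incubation_minutes(glutamate_minutes: int) -> int:
    """Return the time spent in plain culture medium before glutamate is added."""
    if not 0 <= glutamate_minutes <= TOTAL_INCUBATION_MIN:
        raise ValueError("glutamate exposure must lie within the 480 min window")
    return TOTAL_INCUBATION_MIN - glutamate_minutes


# Treatment durations (min) named in the response; 0 corresponds to the control.
schedule = {t: pre_incubation_minutes(t) for t in (0, 30, 60, 120, 240, 480)}
for treatment, pre in schedule.items():
    print(f"glutamate {treatment:>3} min -> pre-incubation {pre:>3} min")
```

      Because every condition shares the same total time in culture, the untreated neurons serve as a time-matched control for each treatment duration.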


      Minor comment: Figure 5E: Control close-up is missing. Response: A close-up view of the control neurons is now provided in Figure 4E of the revised manuscript.


      Minor comment: "Moreover, the number of CRMP2-containing dendritic blebs in neurons at 240 min of glutamate treatment was significantly higher than that in neurons at 30 min of treatment (inset of Figure 5E)." Such a statistic is not shown in the graph. Response: The statistical analysis results are now added to the revised manuscript in Figure 5E.


      Minor comment: "Consistent with this prediction, our bioinformatic analysis revealed that the identified cleavage sites in most of the significantly degraded neuronal proteins during excitotoxicity are mapped within functional domains with well-defined three-dimensional structures (Figures 6A)." Authors might mean figure S12A?

      Response: Correct. Our apology for the mislabelling. This has been corrected to “S12A” in the revised manuscript.

      Minor comment: "Neuronal Src was identified by the three criteria of our bioinformatic analysis to be cleaved by calpains to form a stable truncated protein fragment during excitotoxicity (Figures 6A and Table S6)." Authors might mean figure 6D?

      Response: Correct. Our apology for the mislabelling. Since we merged figures 2 and 4 of the original manuscript, this has been corrected to read “(Figure 5D)” on page 18 of the revised manuscript.

      Minor comment: Figure 2B: Clusters 1, 3, 4 and 6 do not follow treatment trends homogeneously at all time points. For example in cluster 1 there is a phosphopeptide following the pattern 1, 0, -1 and another one following the pattern 0, 1, -1, which is actually a very different pattern even if the end value is stable (-1). The first example could belong to the cluster 6 as well, while the second example to cluster 5. Please elaborate on the rationale behind the categorization. Is there any other clustering method that can be used without making the categorization more complicated?

      Response: Since we merged Figures 2 and 4 of the original manuscript, this comment relates to the right panel of Figure 2A of the revised manuscript. The rationale behind the categorization of the phosphopeptides into six clusters was based upon the patterns of changes of their abundance (i.e. average of log-2 normalized z-score of phosphopeptide intensity) in three sample groups. We calculated the number of permutations where the number of sample groups in the set (n) = 3 (i.e. control neurons, neurons of 30 min glutamate treatment and neurons of 240 min glutamate treatment) and the number of sample groups in each permutation (r) = 3 (i.e. all three sample groups should be present in each permutation). Hence the number of permutations is n!/(n-r)! = 3!/0! = 6. The six clusters refer to the six possible permutations of the patterns of abundance changes of the identified phosphopeptides rather than the end results.
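
      The permutation count behind the six clusters can be checked with a short Python sketch. This is only one reading of the rationale described above, namely that each cluster corresponds to one ordering of the three sample groups by abundance; the group labels, the helper `abundance_ordering` and the example z-scores are hypothetical, and this is not the authors' clustering code.

```python
from itertools import permutations
from math import factorial

groups = ("control", "glutamate_30min", "glutamate_240min")
n = r = len(groups)

# Number of permutations of r groups drawn from n: n! / (n - r)!
count = factorial(n) // factorial(n - r)
orderings = list(permutations(groups, r))
print(count, len(orderings))


def abundance_ordering(z_scores: dict[str, float]) -> tuple[str, ...]:
    """Order the groups from lowest to highest abundance (ties are not handled)."""
    return tuple(sorted(z_scores, key=z_scores.get))


# Hypothetical z-scores for a single phosphopeptide.
example = {"control": -0.8, "glutamate_30min": 0.1, "glutamate_240min": 0.7}
pattern = abundance_ordering(example)
print(pattern in orderings)
```

      Under this reading, any phosphopeptide with three distinct group averages falls into exactly one of the six orderings.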

      Minor comment: A problem of the manuscript is its length and lack of coherence. Apart from presenting the data from the proteomics, phosphoproteomics and N-terminomics analyses, the authors focus on several different proteins to perform validation experiments and further characterize the biological significance of their modification. Because these proteins do not fall on the same pathway, the authors end up presenting several independent stories that confuse the reader.

      Response: We agree that proteins that do not operate in the same signalling pathway were chosen for further biochemical analysis. Their choice was justified because they are key players in the most perturbed canonical signalling pathways identified by bioinformatic analysis with the IPA software. We agree that this may confuse the reader. However, it also helps to illustrate that excitotoxic neuronal death is a complicated cell death process caused by dysregulation of multiple neuronal proteins which regulate different cellular processes.

      Minor comment: Moreover, it is necessary for the authors to restructure their introduction, and avoid over-representing previous research on nerinetide, which is not used anywhere in the manuscript. Instead, the introduction must be more focused to better capture the necessity and essence of the present study. Response: We agree. Based on the reviewer’s comments, we decided to restructure the introduction by shortening the description of the results of Nerinetide research. Please refer to the track changes of the revised manuscript for the changes.

      Minor comment: Taking into account figures 1 and S2 I understand that the authors combined samples of neuronal cell cultures (treated or not with Glu) with samples from mouse brains (that have undergone ischemic stroke/TBI or sham operation). If this is the case, why did the authors do that? How did they combine the different samples? And why is this not mentioned anywhere in the main text?

      Response: For a data-independent acquisition (DIA) based mass spectrometry experiment, it is essential that we first generate a library of identifiable peptides using a standard data-dependent acquisition (DDA) approach. For the DIA type experiment to work, the identified peptides have to be in that library first. Excitotoxicity is a major mechanism of neuronal loss caused by ischemic stroke and traumatic brain injury. We therefore included the brains of sham-operated mice and the brains of mice suffering ischemic stroke and traumatic brain injury to construct the spectral libraries, and that is why the library contains pooled samples from the representative samples. Pre-fractionation of the pooled peptides was also performed to increase the number of identifiable peptides and generate a deeper library.

      Once we generated that library, all samples were analysed individually as separate DIA experiments. The DIA approach then makes use of the generated library for identification and quantitation. This methodology allows for deeper identification and a lower number of missing values. These statements were added to the Methods section of the revised manuscript (page 33).

      Minor comment: Regarding figure 5D, the authors write in the main text "Consistent with our phosphoproteomic results, the truncated fragment CRMP2 fragments could not cross-react with the anti-pT509 CRMP2 antibody (Figure 5D)" In the upper blot the truncated CRMP2 fragment runs well below the 70 kDa marker. However, in the middle panel, where we see the blot with the phospho specific antibody, the respective area of the blot has been cropped, so we cannot see whether the truncated fragment cross-reacts with the phospho specific antibody. Response: The presentation of the western blots in Figure 5D in the revised manuscript are now less cropped and clearly demonstrate there is no cross reactivity of the phospho specific antibody with the truncated fragment. Please refer to the revised Figure 5 for the updated Western blot images.

      Minor comment: It is strange that only 1 and 13 proteins showed significant changes in abundance at 30 and 240min respectively. Especially after 240min of glutamate treatment one could expect that many proteins should change in their levels, since the neurons are almost diminished by cell death at that point. How could the authors explain this phenomenon? Additionally, in their previous publication, they showed that much more proteins change significantly in abundance following glutamate treatment (at 30min and 240min).

      Response: Even though our global spectral libraries contain over 49,000 identifiable peptides derived from 6524 proteins, only 1696 quantifiable proteins were identified in the DIA mass spectrometry analysis (Figure 1) because we used stringent criteria for their identification: (i) false discovery rate of

      We agree with the reviewer that many more proteins are expected to change their abundance at 240 min as significant cell death was detected. However, if we had used less stringent false discovery rates of their identification and quantification, included proteins with just one unique identified peptide and lowered the threshold of abundance fold changes, many more proteins with significantly changed abundance would be detected. But we preferred to use these stringent criteria to ensure a high confidence in our identification of neuronal proteins undergoing significant changes during excitotoxicity.


      In agreement with the low number of neuronal proteins exhibiting significant changes in abundance reported in this manuscript, our previously published study (Hoque, et al. (2019) Cell Death & Disease) detected only 26 neuronal proteins undergoing changes in abundance. Hence, we disagree with the reviewer that our previous publication reported many more proteins undergoing changes in abundance in excitotoxicity.

      Reviewer #1 (Significance (Required)): Comment on significance: The manuscript delivers a large amount of data, regarding changes in the proteome, the activation of specific kinases, phosphatases, as well as the molecular pathways that are activated at distinct time points of excitotoxicity. This information could be used in future studies to validate and develop potential therapeutic strategies that could protect against neuronal loss in various neurological disorders. Response: We are excited that Reviewer #1 felt that this large amount of generated data will be useful for subsequent studies to validate and develop novel therapeutic strategies.

      Comment on significance: The same group has very recently published a work very similar to the particular manuscript (Hoque et al. Cell Death and Disease, 2019). In their previous publication, the authors cover a large part of their current objectives. They performed again a proteomic and phosphoproteomic analysis of mouse primary cortical neurons treated with glutamate for distinct time points, in their aim to identify changes in expression and phosphorylation state of neuronal proteins upon excitotoxicity. Apart from the N-terminome, which they investigate in their current manuscript, the proteomic and phospho proteomic analysis are very similar. As such, and because of the fact that the current manuscript is very extensive, the authors should consider to minimize it, and include only their novel findings (changes in the N-terminome, the involvement of specific kinases that contribute to excitotoxic neuronal death, the regulatory mechanism of CRMP2, etc).

      Response: Since the coverage of phosphoproteins undergoing changes in neurons during excitotoxicity identified in the current study is much higher than that of phosphoproteins identified in our previously published study, we prefer to retain the description of the phosphoproteomic findings in this manuscript. Nonetheless, we agree that the manuscript needs to be shortened. Our suggestions to shorten the manuscript are listed below:

      1. Move the description and results of global proteomic analysis to supplementary information. Since we made the same observation that only a small number of neuronal proteins undergo significant changes in abundance during excitotoxicity in our previously published study, moving the global proteomic analysis results away from the main text will not adversely impact the quality of the presentation.
      2. Move the description of how we classified the identified N-terminal peptides as those derived from degradation and those derived from proteolytic processing to the supplementary information.

      Comment on significance: The authors should describe in a simpler way the proteomic and bioinformatics analyses they are using in the manuscript. It is difficult to understand the methodology used if you are not an expert in proteomics and bioinformatics. My suggestion is to revise their text and make it simpler and more concise.

      Response: We agree with this criticism. As we are not allowed to make a major revision of the manuscript at this stage, the revised manuscript contains only minor revisions that address all of the comments and suggestions provided by the two reviewers. Further changes will be added in the next revised version. Our suggestions to further restructure the manuscript are listed below:

      Figure S5 depicting the rationale for classification of N-terminal peptides as products of degradation and those of proteolytic processing will be moved to the main text. The description of the rationale in the main text will be revised to help readers who are not experts in proteomics to better understand the rationale. A diagram depicting the workflow of our TAILS method will be added as a supplementary figure. For bioinformatic analysis of the proteomic results, we will provide in the supplementary information the definition of the following terms relevant to Ingenuity Pathway Analysis and PhosphoPath analysis of the perturbed biological processes and signalling pathways: (a) Canonical Signalling Pathways, (b) Cellular Processes and (c) Interaction Networks. A short description of how their identification benefits the mapping of the neurotoxic signalling networks in neurons will be provided in the supplementary information.


      REVIEWER #2


      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Comment: In this article, Ameen and collaborators identify the modified proteins during neuronal excitotoxicity by using an in vitro model in which mouse primary cortical neurons are treated for 30 and 240 min with 100 µM glutamate. They use different approaches: quantitative label-free global and phospho-proteomic methods and a quantitative N-terminomic procedure called Terminal Amine Isotopic Labeling of Substrates (TAILS). Results show that 240 min of glutamate treatment has minimal impact on protein abundance (13 neuronal proteins show significant changes) but enhances modification of the phosphorylation state and proteolysis of nearly 900 proteins. A significant part of these proteins are involved in signalling pathways governing cell survival, synaptogenesis and axonal guidance.

      The paper is globally well written and experiments are convincing. The methodology and the analysis are well described and well explained. The text and each figure are clear and accurate. However, I have just one comment that needs answers and/or clarifications. Thanks for your work.

      Response: We appreciate the compliment provided by this reviewer on our submitted manuscript.

      **Minor comment:**

      Minor comment: Primary neurons are used at DIV7 and it has been shown that at DIV7 the percentage of astrocytes is relatively low, however astrocytes plays a key role in glutamate recapture and release. It will be relevant to know the percentage of glial cell in the culture model of the authors and how astrocytes are involved in glutamate recapture and also in excitotoxicity.

      Response: The compositions of the DIV7 cultures are: 94.1+/- 1.1 % neurons, 4.9%+/-1.1% astrocytes, and *

      Reviewer #2 (Significance (Required)):

      Comment on significance: Excitotoxicity is a cell death process involved in many neurological disorders. However, nowadays, there are no existent FDA-approved pharmacological agents targeted to protect against excitotoxicity leading to neuronal death. A better comprehension of excitotoxicity is required to improve prevention, therapy and reparation following the disease.

      With this work, the authors highlighted modified proteins in excitotoxic neurons. Interestingly, few of these proteins are involved in cell survival, mRNA processing or axonal guidance. This atlas of phosphorylation and proteolytic processing events during excitotoxicity permits the identification of new therapeutic targets such as calpain-mediated cleavage of Src kinase. This atlas will interest many teams working on neurological disorders such as Alzheimer disease, Parkinson disease or stroke. It will help to better characterize the cellular/molecular events involved in neuronal loss and to find new therapeutic targets.

      Response: In response to this comment and a similar comment by Reviewer 1, we expanded the discussion to include the potential therapeutic values of our findings.

      Comment on significance: My field of expertise: Stroke, cell death, excitotoxicity, signalling pathways and molecular targets, autophagy. I don't have sufficient expertise to evaluate proteomic analysis.

      Response: No response is needed.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this article, Ameen and collaborators identify the proteins modified during neuronal excitotoxicity by using an in vitro model in which mouse primary cortical neurons are treated for 30 and 240 min with 100 µM glutamate. They use different approaches: quantitative label-free global and phosphoproteomic methods and a quantitative N-terminomic procedure called Terminal Amine Isotopic Labeling of Substrates (TAILS). Results show that 240 min of glutamate treatment has minimal impact on protein abundance (13 neuronal proteins show significant changes) but alters the phosphorylation state and proteolysis of nearly 900 proteins. A significant proportion of these proteins are involved in signalling pathways that regulate cell survival, synaptogenesis and axonal guidance.

      The paper is globally well written and the experiments are convincing. The methodology and the analysis are well described and well explained. The text and each figure are clear and accurate. However, I have just one comment that needs an answer and/or clarification. Thanks for your work.

      Minor comment:

      Primary neurons are used at DIV7 and it has been shown that at DIV7 the percentage of astrocytes is relatively low; however, astrocytes play a key role in glutamate recapture and release. It would be relevant to know the percentage of glial cells in the authors' culture model and how astrocytes are involved in glutamate recapture and in excitotoxicity.

      Significance

      Excitotoxicity is a cell death process involved in many neurological disorders. However, there are currently no FDA-approved pharmacological agents that protect against excitotoxic neuronal death. A better understanding of excitotoxicity is required to improve prevention, therapy and repair following the disease.

      With this work, the authors highlighted modified proteins in excitotoxic neurons. Interestingly, few of these proteins are involved in cell survival, mRNA processing or axonal guidance. This atlas of phosphorylation and proteolytic processing events during excitotoxicity permits the identification of new therapeutic targets such as calpain-mediated cleavage of Src kinase. This atlas will interest many teams working on neurological disorders such as Alzheimer's disease, Parkinson's disease or stroke. It will help to better characterize the cellular/molecular events involved in neuronal loss and to find new therapeutic targets.

      My field of expertise: Stroke, cell death, excitotoxicity, signalling pathways and molecular targets, autophagy. I don't have sufficient expertise to evaluate proteomic analysis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Ameen and colleagues report the results of a multidimensional proteomic analysis which combined quantitative proteomics, phosphoproteomics and N-terminomics in an effort to identify neuronal proteins displaying altered abundance or modifications by proteolysis and/or phosphorylation following an excitotoxic insult. Excitotoxicity is known to initiate by over-activation of ionotropic glutamate receptors which allows an increase in intracellular Ca2+ , ultimately leading to activation of proteases. The analysis revealed that glutamate treatment for up to 240 min did not significantly affect the abundance of neuronal proteins but caused dramatic changes in the phosphorylation state of many neuronal proteins. Based upon the phosphopeptides and neo-N-peptides, which contain the neo-N-terminal amino acid residue generated through proteolytic cleavage of intact neuronal proteins during excitotoxicity, the authors identified the proteins that undergo phosphorylation, dephosphorylation and/or enhanced proteolytic processing in excitotoxic neurons. By combining different software packages, they found that these modified proteins form complex interactions that affect signaling pathways regulating survival, synaptogenesis, axonal guidance and mRNA processing. These data suggest that perturbations in the aforementioned pathways mediate excitotoxic neuronal death. Then, the authors showed by Western blot analysis that CRMP2, a crucial regulator of axonal guidance signaling, exhibited enhanced truncation and reduced phosphorylation at specific sites upon glutamate treatment. These events may contribute to injury to dendrites and synapses associated with excitotoxic neuronal death. Furthermore, the authors showed that calpains are responsible for the proteolytic processing and cathepsins for enhanced degradation of proteins during excitotoxicity. 
      Blockade of the calpain-mediated cleavage site of the tyrosine kinase Src during excitotoxicity confers neuroprotection in an in vivo model of neurotoxicity. In that regard, over twenty protein kinases are predicted to be activated in excitotoxic neurons. Collectively, this study contributes to the construction of an atlas of phosphorylation and proteolytic processing events that occur during excitotoxicity and that can, as such, be targeted for therapeutic purposes.

      Comments

      The identification of potential calpain cleavage sites in neuronal proteins modified during excitotoxicity is an interesting finding of the study. However, the atlas presented appears to miss components such as Kinase D-interacting substrate of 220 kDa (Kidins220), also known as ankyrin repeat-rich membrane spanning (ARMS), a protein recently shown to be cleaved by calpain during excitotoxicity (López-Menéndez et al, 2019, Cell Death and Disease 10, 535).

      The CRMP2 antibody (Cell Signalling, 35672) used for western blots (figure 5D, also figure S11) and immunofluorescence (figure 5E) is problematic. Copied from https://www.cellsignal.com/products/primary-antibodies/crmp-2-d8l6v-rabbit-mab/35672: Monoclonal antibody is produced by immunizing animals with a synthetic peptide corresponding to residues surrounding Ile546 of human CRMP-2 protein. The truncated CRMP2 (figure 5D) studied in the whole section (residues 1-516 or 1-517, ~57kDa) cannot be recognized by this monoclonal antibody. The detected band with the red letters in figure 5D might represent another cleavage product. In any case, asking Cell Signalling for more information about the exact immunogen might help, but since the antibody is monoclonal and raised against residues surrounding Ile546, its epitope is very unlikely to include residues upstream of aa516. The whole results section and discussion have to be reconsidered. Alternatively, another antibody can be used to repeat those experiments in order to support the hypothesis. Time and resources are very familiar to authors since they have to repeat their previous work with a new antibody. Finally, there are no "western blot" and "immunofluorescence" methods for CRMP2.

      The truncated DCLK1 bands detected in figure S8B cannot be attributed to the proteolytic processing of DCLK1 at the sites described: T311↓S312, S312↓S313 and N315↓G316 (predicted M.W. of the C-terminal products: 48.7-49.1kDa (figure S8A), which are too close to be well separated by conventional PAGE). The number and the separation of the bands suggest other cleavage sites.

      Could the striking observation that almost all proteolytic processing during excitotoxicity is catalyzed by calpains and/or cathepsins have derived (partially) from non-specific targets of calpeptin, such as a subset of tyrosine phosphatases (Schoenwaelder and Burridge, 1999: approx. 1h treatment of fibroblasts with an approx. 10x lower concentration) or other(s)?

      Describing the final part of figure 4C the authors suggest that "Liver kinase B1 homolog (LKB1), CaM kinase kinase β (CaMKKβ) and transforming growth factor‐β‐activating kinase 1 (TAK1) are the known upstream kinases directly phosphorylating T172 of AMPKα to activate AMPK (Herrero-Martin et al., 2009; Woods et al., 2005; Woods et al., 2003). Our findings therefore predict activation of these kinases during excitotoxicity (Figure 4C)." The first question arising here is whether these three kinases are the only ones known to phosphorylate AMPKα. Even if this is true, it is highly speculative to suggest that the findings of the present study predict the activation of these kinases during excitotoxicity, without providing the necessary experimental data, since the increased phosphorylation of AMPK may be an indirect effect of the reduced function of a phosphatase. Thus the proposed model does not hold.

      Minor points

      Highlights could present the key points of the study in a more straightforward manner.

      Figure 4A is too complicated. Proteins considered as hubs of signaling pathways in neurons should be somehow highlighted to distinguish them.

      The analysis of proteins with enhanced truncation and reduced phosphorylation such as CRMP2 and DCLK1 is fragmented. In addition, the authors should mention the criteria based on which these proteins were selected for further analysis.

      The potential therapeutic relevance of phosphorylation and proteolytic processing events that occur during excitotoxicity can be further explored.

      I am sorry but I could not find Figure 8, which is supposed to show the "In vivo model of NMDA neurotoxicity" (please, see page 30).

      Introduction: O'Collins et al., 2006; Savitz and Fisher, 2007; both references are missing.

      Figure S1A-B: vehicle treatment time course is needed.

      Figure 5E: Control close-up is missing.

      "Moreover, the number of CRMP2-containing dendritic blebs in neurons at 240 min of glutamate treatment was significantly higher than that in neurons at 30 min of treatment (inset of Figure 5E)." Such a statistic is not shown in the graph.

      "Consistent with this prediction, our bioinformatic analysis revealed that the identified cleavage sites in most of the significantly degraded neuronal proteins during excitotoxicity are mapped within functional domains with well-defined three-dimensional structures (Figures 6A)." Authors might mean figure S12A?

      "Neuronal Src was identified by the three criteria of our bioinformatic analysis to be cleaved by calpains to form a stable truncated protein fragment during excitotoxicity (Figures 6A and Table S6)." Authors might mean figure 6D?

      Figure 2B: Clusters 1, 3, 4 and 6 do not follow treatment trends homogeneously at all time points. For example, in cluster 1 there is a phosphopeptide following the pattern 1, 0, -1 and another one following the pattern 0, 1, -1, which is actually a very different pattern even if the end value is the same (-1). The first example could belong to cluster 6 as well, and the second example to cluster 5. Please elaborate on the rationale behind the categorization. Is there any other clustering method that can be used without making the categorization more complicated?

      A problem of the manuscript is its length and lack of coherence. Apart from presenting the data from the proteomics, phosphoproteomics and N-terminomics analyses, the authors focus on several different proteins to perform validation experiments and further characterize the biological significance of their modification. Because these proteins do not fall on the same pathway, the authors end up presenting several independent stories that complicate the reader.

      Moreover, it is necessary for the authors to restructure their introduction, and avoid over-representing previous research on nerinetide, which is not used anywhere in the manuscript. Instead, the introduction must be more focused to better capture the necessity and essence of the present study.

      Taking into account figures 1 and S2, I understand that the authors combined samples of neuronal cell cultures (treated or not with Glu) with samples from mouse brains (that have undergone ischemic stroke/TBI or sham operation). If this is the case, why did the authors do that? How did they combine the different samples? And why is this not mentioned anywhere in the main text?

      Regarding figure 5D , the authors write in the main text "Consistent with our phosphoproteomic results, the truncated fragment CRMP2 fragments could not cross-react with the anti-pT509 CRMP2 antibody (Figure 5D)" In the upper blot the truncated CRMP2 fragment runs well below the 70 kDa marker. However, in the middle panel, where we see the blot with the phospho specific antibody, the respective area of the blot has been cropped, so we cannot see whether the truncated fragment cross-reacts with the phospho specific antibody.

      It is strange that only 1 and 13 proteins showed significant changes in abundance at 30 and 240 min, respectively. Especially after 240 min of glutamate treatment, one would expect the levels of many proteins to change, since most of the neurons have been lost to cell death by that point. How could the authors explain this phenomenon? Additionally, in their previous publication, they showed that many more proteins change significantly in abundance following glutamate treatment (at 30 min and 240 min).

      Significance

      The manuscript delivers a large amount of data, regarding changes in the proteome, the activation of specific kinases, phosphatases, as well as the molecular pathways that are activated at distinct time points of excitotoxicity. This information could be used in future studies to validate and develop potential therapeutic strategies that could protect against neuronal loss in various neurological disorders.

      The same group has very recently published a work very similar to the particular manuscript (Hoque et al. Cell Death and Disease, 2019). In their previous publication, the authors cover a large part of their current objectives. They performed again a proteomic and phosphoproteomic analysis of mouse primary cortical neurons treated with glutamate for distinct time points, in their aim to identify changes in expression and phosphorylation state of neuronal proteins upon excitotoxicity. Apart from the N-terminome, which they investigate in their current manuscript, the proteomic and phospho proteomic analysis are very similar. As such, and because of the fact that the current manuscript is very extensive, the authors should consider to minimize it, and include only their novel findings (changes in the N-terminome, the involvement of specific kinases that contribute to excitotoxic neuronal death, the regulatory mechanism of CRMP2, etc).

      The authors should describe in a simpler way the proteomic and bioinformatics analyses they are using in the manuscript. It is difficult to understand the methodology used if you are not an expert in proteomics and bioinformatics. My suggestion is to revise their text and make it simpler and more concise.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript describes two advances. First is the technical development for a protein targeting system called PInT that brings a target protein close to (~320 bp) a DNA sequence of interest. The idea is that localisation of the target protein allows one to distinguish its effects on the DNA sequence either in cis (when targeted) or in trans (when not targeted but expressed at the same level). Since targeting is conveyed by simply adding the small molecule ABA to the experiment, it is easy to compare the two situations. This is a clever idea and it is substantiated by data showing that the components of PInT do not affect triplet repeat instability or gene expression of GFP, into whose gene the PInT system is placed. Moreover, targeting is shown to enable enzymatic activity in the targeted region. Using the DNA methylase DNMT1, there are local increases in DNA methylation. Similarly, targeting the histone deacetylase HDAC5 results in local decreases in histone H3 acetylation.

      We thank the reviewer for a thoughtful and helpful review.

      What is not clear from these experiments, however, is whether the targeted proteins can interact normally with partner proteins to form functional complexes. One necessary control is to add ChIP for at least one interacting protein each for DNMT1 and for HDAC5 and show that targeting permits normal protein-protein interactions. This experiment is straightforward as specific interacting proteins are known and good antibodies to precipitate those proteins are available.

      This is a good suggestion and we plan on doing this experiment in our 59B-Y-HDAC5 and 89B-Y-DNMT1 lines with and without ABA using interacting proteins. The exact interacting protein to be used will depend on antibody availability and quality, which we will test. We will start with UHRF1 and HDAC3 for PYL-DNMT1 and PYL-HDAC5, respectively.

      Overall, PInT would likely be useful for many groups studying the effects of chromatin modifiers on a DNA sequence of interest.

      The second advance is conceptual and is focused more specifically on triplet repeat expansions. The manuscript describes experiments that measure genetic instability of long CAG-CTG repeats with and without protein targeting. The results show that allele size distributions are not significantly affected by targeting either DNMT1 or HDAC5. One curious outcome that is not discussed is contraction frequency in the HDAC5 experiment. Zero contractions are reported compared to 10-20% contractions in the other two experiments. Authors need to provide an explanation.

      Lack of contractions in this experiment is likely due to the lower number of repeats in this line (59 vs 89/91). It is known that longer repeats display higher frequency of contractions, and contractions are rarely seen in short repeats (Larson et al Neurobiology of Disease 2015, Gomes-Pereira et al PLOS Genet 2007, Morales et al HMG 2020). Albeit, the threshold may be different in our HEK293-derived cells. Of note, we had a clone of 89B-Y-HDAC5 that did not express the expected amount of GFP for unknown reasons and we did not use it here. However, small pool PCRs using this line with 89 repeats showed that contractions were indeed present. Although we cannot rule out that the reason for the contractions is the unknown mutation(s), it suggests that the difference is due to the size of the expansion. We have added a comment in the methods section.

      It reads: “We have noted that cell lines with repeats that are mildly expanded (e.g., 59 CAGs) have fewer contractions than longer ones. This is consistent with several studies in the context of DM1 and HD [82], albeit the size threshold for seeing more contractions may be shorter in HEK293-derived cells than in mice.”

      The major issue with this set of experiments is that there is no positive control where instability is shown to be clearly manipulated. A knockdown of FAN1 would be the most likely avenue to pursue for identifying a positive control. This is straightforward to perform since successful FAN1 knockdowns have been described in the literature.

      We agree that a positive control to show that the model behaves as expected is necessary. We will add the experiments proposed by the reviewer in the revised version of the manuscript.

      The manuscript also looks at effects on gene expression measured by GFP fluorescence intensity. The potential significance is to see if disease-causing genes with expanded triplet repeats can be silenced by targeting chromatin-modifying enzymes. In the examples tested here, the answer seems to be no. Expression of DNMT1 or HDAC5 reduce fluorescence even in the absence of targeting. Upon targeting, there is a small further decrease, but the expanded triplet repeat resists this further decrease. Domain analysis of HDAC5 indicates that protein-protein interactions, not deacetylase activity, are important for silencing. The key interaction may be with HDAC3, since small molecule inhibition of HDAC3 relieved repeat length-dependent silencing by HDAC5. It was very curious that targeting HDAC3 actually increased expression, instead of silencing. The explanation for this observation was inadequate.

      We have added the following paragraph to the discussion to address this.

      It reads: “We found that targeting of PYL-HDAC3 increases gene expression slightly, independently of repeat size and in the presence of an inhibitor of its catalytic activity. Although this appears counterintuitive, several studies suggest that this is not unexpected. Specifically, HDAC3 has an essential role in gene expression during mouse development that is independent of its catalytic activity [73]. Moreover, HDAC3 binds more readily to genes that are highly expressed in both human and yeast cells [74,75]. The mechanism or function of HDACs binding to highly expressed genes are currently unknown.”

      The claim on page 16 final paragraph that the manuscript 'settled a central question for both HDAC5 and DNMT1 and their involvement in CAG/CTG repeat instability' is not supported by the data. Most of the results are negative so it is premature to claim the question is 'settled'.

      We have rephrased all the conclusions about this in the text, emphasizing that we find no evidence of a role in cis, rather than stating that there is no role in cis.

      Overall, with appropriate modifications described here, these experiments would be of interest with regards to potential therapies of triplet repeat expansion diseases, where silencing the expanded gene is the goal.

      **Minor concerns**

      P 4, last line. 59 bp should read 59 repeats

      This is now fixed.

      P 5, line 2. 38 bp of what?

      This is now amended. It reads: “The CAG/CTG repeats affect splicing of the reporter in a length-dependent manner, with longer repeats leading to more robust insertion of an alternative CAG exon that includes 38 nucleotides downstream of the CAG, creating a frameshift [30].”

      P 10, first paragraph. DNA methylation levels rise from ~10% to ~20% with DNMT1 targeting. Is there a good precedent in the literature that the magnitude of this increase can be expected to be biologically meaningful?

      To our knowledge, this is the first time that DNMT1 has been used for targeted epigenome editing. This is therefore the first evidence that targeting DNMT1 leads to silencing of a reporter construct. Nevertheless, the reviewer’s question stands: is an increase in DNA methylation of 10 to 20% biologically relevant? The answer is yes: changes of 10-20% are known to have a functional impact on gene expression in various settings (for example, see the recent study in developing oocytes by Li et al Nature 2018). Furthermore, there is evidence that DNMT1 has weak de novo activity (Li et al Nature 2018, Wang et al Nat Genet 2020), consistent with a small increase in CpG methylation upon targeting. We now acknowledge in the discussion that one reason for the lack of effect upon targeting may be that the changes in CpG methylation are not dramatic enough. We also point out more clearly that changes of 10 to 20% are correlated with changes in repeat instability (Dion et al HMG 2008). We have amended the text to reflect this.

      The results now read: “To do so, we performed bisulfite sequencing after targeting PYL-DNMT1 for 30 days. This led to changes of 10 to 20% in the levels of CpG methylation, a modest increase (Fig. 3C), which is in line with the weak de novo methyltransferase activity of DNMT1 (for example see [39,40]). Similar changes in levels of CpG methylation in Dnmt1 heterozygous ovaries and testes were seen to correlate with changes in repeat instability in vivo [31].”

      The discussion now states: “It should be pointed out that there remains the possibility that DNMT1 targeting did not lead to large enough changes in CpG methylation to affect repeat instability.”

      P12 first paragraph. Text describing Fig 5 is confusing. First, GFP expression is referred to in terms of fold decrease, but subsequently in percent. Second, the ABA-induced silencing looks to reduce expression from about 0.6 to 0.5 of control. I presume this is where the claim of 16% comes from but it was not clear.

      Indeed, this is what we mean.

      We now state: “In 16B-Y-DNMT1 cells, ABA treatment decreased GFP expression by 2.2-fold compared to DMSO treatment alone. Surprisingly, ABA-induced silencing was 1.8 fold compared to DMSO alone, or 16% less efficient in 89B-Y-DNMT1 than in 16B-Y-DNMT1 cells.”

      P 15 paragraph 2. Where does the P value of 0.78 come from? Fig 7B shows no corresponding value.

      The P-value in figure 7B has now been corrected.

      Reviewer #1 (Significance (Required)):

      See above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      We still do not know whether epigenetics contributes to repeat instability and/or transcriptional activity in unstable CAG/CTG repeat associated pathologies. The aim of this manuscript is to examine whether induced binding of DNMT1 (CpG methylation) or HDAC5 (histone H3 acetylation) modulates CAG/CTG repeat instability and/or gene silencing upon expansion. For this the authors developed a highly sophisticated reporter system (PInT) that allows the recruitment of a specific chromatin modifying enzyme (DNMT1/HDAC5) to a GFP reporter near a CAG/CTG expansion, in the course of transcription (Dox-inducible promoter). This is to determine whether the CTGs, when lengthened and transcribed, become unstable or impede gene activity via epigenetic modifications.

      We appreciate the reviewer highlighting the importance of the question that we address here and the usefulness of PInT.

      **Findings:**

      1. Binding of DNMT1 to the reporter results in a modest increase (~10%) in local DNA methylation, with no change in repeat instability.

      3. Targeting HDAC5 to the reporter results in local reduction in histone H3 acetylation, with no effect on repeat stability.

      4. DNMT1/HDAC5 binding reduces GFP intensity differentially, in normal but not expanded alleles.

      5. The N-terminal domain of HDAC5, when mutated, abolishes the reduction in GFP expression levels.

      6. RGFP966 abolishes the allele-specific effect of HDAC5, resulting in a general decrease in GFP expression regardless of repeat tract size.

      7. CTG expanded alleles abolish the reduction in GFP repression by HDAC5 via HDAC3 activity.

      **Conclusions:**

      Based on the results using the PInT reporter assay, the authors claim that:

      1. HDAC5 and DNMT1 do not affect repeat instability in cis

      2. Expanded CAG/CTGs reduce the efficiency of gene silencing by targeting DNMT1/HDAC5 to the locus

      3. Gene silencing that is mediated by HDAC5 recruitment can be abolished by inhibition of HDAC3 activity

      Unfortunately, none of the claims in this manuscript are convincing.

      We note that in the comments below the reviewer does not include a reason why he/she does not find the claims convincing. We therefore cannot address this criticism.

      **General Comments:**

      The major drawback of the PInT experimental approach is that it ignores the importance of the flanking regions and the genomic organization of the endogenous locus. This is a major concern as it makes the conclusions irrelevant to the related loci. In the case of myotonic dystrophy type 1, for example, the reporter should reside within a CpG island, should be positioned immediately next to CTCF binding site(s), and should be transcribed bi-directionally.

      HDAC3 and DNMT1 were found to have effects on repeat instability both at reporters, which do not harbour flanking sequences from disease loci, and indeed at endogenous loci in vivo (Dion et al HMG 2008, Debacker et al PLoS Biol 2012, Suelves et al Sci Rep 2017, Williams et al PNAS 2020). This highlights the fact that cis elements from disease loci are not required for chromatin modifiers to affect repeat instability.

      The reviewer is suggesting a very interesting set of experiments where specific sequences may be added to our reporter and tested for their influence on gene expression and on repeat instability. PInT is ideally suited for this and we have now added a paragraph highlighting this in the discussion. We have also highlighted that the current study aims to isolate the repeats from its cis-elements to specifically side-step potential locus-specific effects and to look for chromatin modifiers that would be useful for epigenome editing for as many loci as possible.

      Furthermore, only large expansions (at least several hundred copies) can trigger heterochromatin at the DM1 locus. None of these features are recapitulated by the PInT reporter assay, making it difficult to draw any conclusion regarding the role of these chromatin modifying enzymes to the locus.

      This is true for DM1 but untrue for other disease loci. For example, we have shown that there are changes in the flanking chromatin marks at the SCA1 locus of a mouse model with 145 repeats (Dion et al HMG 2008), DNA methylation is also affected near a SCA7 transgene with 92 CAG repeats (Libby et al PLoS Genet 2008) and transgenes containing CAG repeats (without the flanking sequences) lead to silencing regardless of where the transgene is integrated in the genome (Saveliev et al Nature 2003). Moreover, HDAC5 had effects on repeat expansion in a cell-based shuttle system containing as few as 22 CAG repeats (Gannon et al NAR 2012), again suggesting that chromatin modifiers affect repeat instability in a wide range of repeat sizes. We have reviewed this in Dion and Wilson TiG 2009.

      In fact, the authors state in their Discussion that "targeting a chromatin modifying peptide to different loci can have very different effects"!

      This is indeed the case and the reason why we sought to control for locus-specific effects using an exogenous reporter.

      To better substantiate their conclusions the authors must set up an improved model system that takes into account the flanking regions and the 3D genomic organization of the locus (TADs). The preferable approach would be to insert a reporter cassette by homologous recombination into the differentially methylated/acetylated regions near the repeats, and compare between normal vs. expanded alleles.

      We would like to point out that we have recently published a study where we looked at 3D chromatin folding at the DM1, HD, and the GFP transgene used here. We did not find any evidence for changes in TADs that would underlie changes in repeat instability at these loci (Ruiz Buendia et al Sci Advances 2020). We therefore do not think that it would be important to further manipulate 3D genomic organization in this context.

      To be clear, we are not denying that cis elements are likely to have an effect; there is plenty of evidence supporting this. Rather, we are using a reporter assay to disentangle the potential locus-specific (or cis-element specific) effects from the trans-activating factors. In short, we focus on the trans-acting factors rather than on the cis-elements, as suggested by the reviewer.

      We believe that the addition of the following paragraph highlights the goal of our study and also brings in the idea that cis-acting elements can be studied using PInT.

      It now reads:

      “We designed PInT specifically to isolate expanded repeat tracts from other potential locus-specific cis elements. This is helpful to identify factors that would affect instability and/or gene expression across several diseases. Moreover, both HDAC3 and DNMT1 were found to impact repeat instability at different loci, including at reporter genes [31,33,36,37,45]. These observations highlight that cis-acting elements from disease loci are not required by chromatin modifiers to affect repeat instability. A potential application of PInT includes cloning in specific cis elements, including CTCF binding sites and CpG islands, next to the repeat tract and evaluating their effects on instability with or without targeting. In fact, PInT can be used to clone any sequence of interest near the targeting site and can be applied for a wide array of applications, beyond the study of expanded CAG/CTG repeats.”

      My impression was that there is a lot of data but none of it makes sense.

      The focus of the manuscript is not entirely clear: it starts with monitoring the effect of epigenetics on repeat instability and gene activity, then it shifts to the mechanism by which HDAC5 functions, and ends with the allele-specific effect of HDAC5 on gene expression. I lost my train of thought.

      We have now improved the transitions in this new version of this manuscript. Specifically, at the core of this manuscript is the development of PInT, which is highly versatile and allowed us to study multiple aspects of expanded CAG/CTG repeat biology. We hope that it is now clearer.

      **Other concerns:**

      (1)the modest increase in methylation levels following DNMT1 recruitment (10%, reaching a total of 20% at the most) prevents drawing any conclusions regarding the effect of methylation on stability or expression.

      As mentioned in the response to reviewer 1 above, although 10% to 20% CpG methylation is associated with changes in gene expression in a variety of settings, we now point out that one reason for the lack of effect in cis is that the de novo activity of DNMT1 is too weak to produce an effect.

      (2)The effect of protein targeting on GFP levels should be better defined at the RNA/protein level. Does it act by blocking transcription? alternative splicing? or alters steady state levels?

      Although the exact mechanism remains unclear, this goes beyond the scope of the current study. All of these mechanisms remain possible, as we pointed out in the discussion.

      (3)Fig 5: the scale is different for A vs. B and C. Also, better to compare the effect of targeting on equal sized expansions (either 91, 89 or 58 repeats).

      We have fixed the scale on the figures.

      Unfortunately, it is not possible to have the same repeat sizes for all the cell lines because by their very nature, repeats are unstable. We have added a note relating to this in the methods.

      It reads: “Notably, it is not possible to obtain several stable lines with the exact same repeat size as they are, by their nature, highly unstable. This is why we have lines with different repeat sizes. Furthermore, the sizes can change over time and upon thawing.”

      (4)Add asterisks for significance in all figures.

      This has now been done.

      (5)Figure 6: show raw data rather than normalized.

      We have now added representative flow cytometry profiles for each construct as a new supplementary figure (S5).

      (6)Figure 7: there is a notable difference in GFP expression levels in untreated wild type control (16 CAG repeats) between A vs. B. Why?

      Fig. 7a shows PYL targeting only, whereas 7b shows the GFP expression upon PYL-HDAC5 targeting. The values for PYL-HDAC5 targeting are lower because targeting it, unlike targeting PYL alone, silences the reporter.

      (7)Avoid redundancy. No need to show schematic representations so many times.

      We believe that the schematics make it clearer for the reader.

      Reviewer #2 (Significance (Required)):

      REFEREES CROSS-COMMENTING

      I totally agree with Reviewer #1 that the PInT targeting system is a potent experimental tool to study the function of specific chromatin binding proteins. However, the significance of the flanking regions is discounted.

      We hope it is now clear that we are not discounting the potential significance of flanking regions and that rather we have designed the system to avoid their potentially complicating effects.

      The fact that the recruitment of HDAC5 has resulted in a significant reduction in acetylated histones provides evidence that "the targeted proteins can interact normally with partner proteins to form functional complexes". Still, I agree that the activity of DNMT1 needs to be better established, considering the minor increase in DNA methylation levels.

      We will be using ChIP against interacting proteins of DNMT1 and HDAC5 to address this issue.

      The request for a positive control for repeat instability is totally correct.

      We will be adding this in the revised manuscript.

      It is difficult to discuss the missing effect of HDAC5 on contractions or the unexpected effect of HDAC3 on gene silencing bearing in mind the limits of the experimental system.

      There is no expectation for the effect of HDAC5 on contractions as this has not been studied in any system yet. However, we believe that there are no contractions not because of HDAC5 per se but rather because of the shorter repeat size of this line (see comment to reviewer 1 above). We have now addressed the “unexpected effect” of HDAC3 by citing a number of studies finding a similar, evolutionarily conserved effect (see comment to Reviewer 1 above).

      I also agree that the statement "this manuscript settled a central question for both HDAC5 and DNMT1 and their involvement in CAG/CTG repeat instability" is not supported by the data.

      We have now rephrased our conclusions. In this particular case, we changed ‘settled’ to ‘addressed’. We have also rephrased this in the results headings.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      We still do not know whether epigenetics contributes to repeat instability and/or transcriptional activity in unstable CAG/CTG repeat associated pathologies. The aim of this manuscript is to examine whether induced binding of DNMT1 (CpG methylation) or HDAC5 (histone H3 acetylation) modulates CAG/CTG repeat instability and/or gene silencing upon expansion. For this the authors developed a highly sophisticated reporter system (PInT) that allows recruitment of a specific chromatin modifying enzyme (DNMT1/HDAC5) to a GFP reporter near a CAG/CTG expansion, in the course of transcription (Dox-inducible promoter). This is to determine whether the CTGs, when lengthened and transcribed, become unstable or impede gene activity via epigenetic modifications.

      Findings:

      1. Binding of DNMT1 to the reporter results in a modest increase (~10%) in local DNA methylation, with no change in repeat instability.

      3. Targeting HDAC5 to the reporter results in local reduction in histone H3 acetylation, with no effect on repeat stability.

      4. DNMT1/HDAC5 binding reduces GFP intensity differentially, in normal but not expanded alleles.

      5. The N-terminal domain of HDAC5, when mutated, abolishes the reduction in GFP expression levels.

      6. RGFP966 abolishes the allele-specific effect of HDAC5, resulting in a general decrease in GFP expression regardless of repeat tract size.

      7. CTG expanded alleles abolish the reduction in GFP repression by HDAC5 via HDAC3 activity.

      Conclusions:

      Based on the results using the PInT reporter assay, the authors claim that:

      1. HDAC5 and DNMT1 do not affect repeat instability in cis.

      2. Expanded CAG/CTGs reduce the efficiency of gene silencing by targeting DNMT1/HDAC5 to the locus.

      3. Gene silencing that is mediated by HDAC5 recruitment can be abolished by inhibition of HDAC3 activity.

      Unfortunately, none of the claims in this manuscript are convincing.

      General Comments:

      The major drawback of the PInT experimental approach is that it ignores the importance of the flanking regions and the genomic organization of the endogenous locus. This is a major concern as it makes the conclusions irrelevant to the related loci. In the case of myotonic dystrophy type 1, for example, the reporter should reside within a CpG island, should be positioned immediately next to CTCF binding site(s), and should be transcribed bi-directionally. Furthermore, only large expansions (at least several hundred copies) can trigger heterochromatin at the DM1 locus. None of these features are recapitulated by the PInT reporter assay, making it difficult to draw any conclusion regarding the role of these chromatin modifying enzymes at the locus. In fact the authors state in their Discussion that "targeting a chromatin modifying peptide to different loci can have very different effects"! To better substantiate their conclusions the authors must set up an improved model system that takes into account the flanking regions and the 3D genomic organization of the locus (TADs). The preferable approach would be to insert a reporter cassette by homologous recombination into the differentially methylated/acetylated regions near the repeats, and compare between normal vs. expanded alleles.

      My impression was that there is a lot of data but none of it makes sense.

      The focus of the manuscript is not entirely clear: it starts with monitoring the effect of epigenetics on repeat instability and gene activity, then it shifts to the mechanism by which HDAC5 functions, and ends with the allele-specific effect of HDAC5 on gene expression. I lost my train of thought.

      Other concerns:

      (1)the modest increase in methylation levels following DNMT1 recruitment (10%, reaching a total of 20% at the most) prevents drawing any conclusions regarding the effect of methylation on stability or expression.

      (2)The effect of protein targeting on GFP levels should be better defined at the RNA/protein level. Does it act by blocking transcription? alternative splicing? or alters steady state levels?

      (3)Fig 5: the scale is different for A vs. B and C. Also, better to compare the effect of targeting on equal sized expansions (either 91, 89 or 58 repeats).

      (4)Add asterisks for significance in all figures.

      (5)Figure 6: show raw data rather than normalized.

      (6)Figure 7: there is a notable difference in GFP expression levels in untreated wild type control (16 CAG repeats) between A vs. B. Why?

      (7)Avoid redundancy. No need to show schematic representations so many times.

      Significance

      REFEREES CROSS-COMMENTING

      I totally agree with Reviewer #1 that the PInT targeting system is a potent experimental tool to study the function of specific chromatin binding proteins. However, the significance of the flanking regions is discounted. The fact that the recruitment of HDAC5 has resulted in a significant reduction in acetylated histones provides evidence that "the targeted proteins can interact normally with partner proteins to form functional complexes". Still, I agree that the activity of DNMT1 needs to be better established, considering the minor increase in DNA methylation levels. The request for a positive control for repeat instability is totally correct. It is difficult to discuss the missing effect of HDAC5 on contractions or the unexpected effect of HDAC3 on gene silencing bearing in mind the limits of the experimental system. I also agree that the statement "this manuscript settled a central question for both HDAC5 and DNMT1 and their involvement in CAG/CTG repeat instability" is not supported by the data.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript describes two advances. First is the technical development for a protein targeting system called PInT that brings a target protein close to (~320 bp) a DNA sequence of interest. The idea is that localisation of the target protein allows one to distinguish its effects on the DNA sequence either in cis (when targeted) or in trans (when not targeted but expressed at the same level). Since targeting is conveyed by simply adding the small molecule ABA to the experiment, it is easy to compare the two situations. This is a clever idea and it is substantiated by data showing that the components of PInT do not affect triplet repeat instability or gene expression of GFP, into whose gene the PInT system is placed. Moreover, targeting is shown to enable enzymatic activity in the targeted region. Using the DNA methylase DNMT1, there are local increases in DNA methylation. Similarly, targeting the histone deacetylase HDAC5 results in local decreases in histone H3 acetylation. What is not clear from these experiments, however, is whether the targeted proteins can interact normally with partner proteins to form functional complexes. One necessary control is to add ChIP for at least one interacting protein each for DNMT1 and for HDAC5 and show that targeting permits normal protein-protein interactions. This experiment is straightforward as specific interacting proteins are known and good antibodies to precipitate those proteins are available. Overall, PInT would likely be useful for many groups studying the effects of chromatin modifiers on a DNA sequence of interest.

      The second advance is conceptual and is focused more specifically on triplet repeat expansions. The manuscript describes experiments that measure genetic instability of long CAG-CTG repeats with and without protein targeting. The results show that allele size distributions are not significantly affected by targeting either DNMT1 or HDAC5. One curious outcome that is not discussed is contraction frequency in the HDAC5 experiment. Zero contractions are reported compared to 10-20% contractions in the other two experiments. Authors need to provide an explanation. The major issue with this set of experiments is that there is no positive control where instability is shown to be clearly manipulated. A knockdown of FAN1 would be the most likely avenue to pursue for identifying a positive control. This is straightforward to perform since successful FAN1 knockdowns have been described in the literature. The manuscript also looks at effects on gene expression measured by GFP fluorescence intensity. The potential significance is to see if disease-causing genes with expanded triplet repeats can be silenced by targeting chromatin-modifying enzymes. In the examples tested here, the answer seems to be no. Expression of DNMT1 or HDAC5 reduces fluorescence even in the absence of targeting. Upon targeting, there is a small further decrease, but the expanded triplet repeat resists this further decrease. Domain analysis of HDAC5 indicates that protein-protein interactions, not deacetylase activity, are important for silencing. The key interaction may be with HDAC3, since small molecule inhibition of HDAC3 relieved repeat length-dependent silencing by HDAC5. It was very curious that targeting HDAC3 actually increased expression, instead of silencing. The explanation for this observation was inadequate. The claim on page 16 final paragraph that the manuscript 'settled a central question for both HDAC5 and DNMT1 and their involvement in CAG/CTG repeat instability' is not supported by the data. Most of the results are negative so it is premature to claim the question is 'settled'. Overall, with appropriate modifications described here, these experiments would be of interest with regards to potential therapies of triplet repeat expansion diseases, where silencing the expanded gene is the goal.

      Minor concerns

      P 4, last line. 59 bp should read 59 repeats

      P 5, line 2. 38 bp of what?

      P 10, first paragraph. DNA methylation levels rise from ~10% to ~20% with DNMT1 targeting. Is there a good precedent in the literature that the magnitude of this increase can be expected to be biologically meaningful?

      P12 first paragraph. Text describing Fig 5 is confusing. First, GFP expression is referred to in terms of fold decrease, but subsequently in percent. Second, the ABA-induced silencing looks to reduce expression from about 0.6 to 0.5 of control. I presume this is where the claim of 16% comes from but it was not clear.

      P 15 paragraph 2. Where does the P value of 0.78 come from? Fig 7B shows no corresponding value.

      Significance

      See above.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Summary

      The authors present well written work on the evolution of proteome size and complexity, and the corresponding changes in chaperone proteins. Interestingly, they find chaperone copy numbers increase linearly with proteome size, despite the increasing 'complexity' of, in particular, post-LECA genomes. They suggest that to address the rise in complexity, organisms express chaperones at higher levels and an expanding network of co-chaperones has evolved across the tree of life.

      Major comments

      Comment-1. Summary reads strangely relative to the rest of the manuscript, and lists facts in a way that makes the purpose of the study confusing. I think most readers will dislike the characterisation of evolution as a progress from simple to complex, and the authors might want to avoid this language throughout the manuscript- bacteria and archaea have also been evolving over this period of time, and have not become more 'complex'? Similarly the authors should reconsider their figure legend titles. As a specific example, 'in the course of evolution' should become 'across the tree of life'.

      Response

      Thank you for these crucial suggestions. We agree with the reviewer, and with Reviewer 2 (see below) that bacteria and archaea have also been evolving since their emergence, so basically, we (humans) and the simplest archaea have the same evolutionary origin. However, we all agree that the simplest archaea/bacteria are far more similar to LUCA than we are. That said, we accept the criticism that putting our analysis in the context of evolutionary time is an over-interpretation given that we have not examined the protein/proteome phylogeny (in relation to proteome complexity; for chaperones we have). We have thus reformulated the figures and text, to a comparison across the Tree of Life, rather than a time-dependent evolutionary process. Specifically: as a first step, we revised the Figures to rename the X-axis as “Order of divergence”, rather than “Divergence time (million years)” in the previous version. In the revised main text we emphasized the fact that the branch lengths of the Tree of Life represent the relative order of divergence of the different clades, rather than time. All instances of ‘in the course of evolution’ have been replaced by ‘across the Tree of Life’.

      Secondly, we revised the main text to emphasize the prokaryote vs. eukaryote comparison, rather than comparing organisms that diverged at different time-points. Within bacterial and archaeal domains, proteomes do not seem to expand against the order of divergence (as the reviewer argued, bacteria and archaea have not become more complex, also see Comment-5).

      Thirdly, the word ‘complexity’ has been omitted from the manuscript. The section “The expansion of proteome complexity” now reads as “Proteome expansion by de novo innovations”. In the previous version, increasing complexity in fact implied a torrent of de novo innovations that impose a larger burden on the chaperone machinery. Instead of ‘complexity’, the latter is clearly stated in the revised manuscript.

      In the spirit of these changes, the title of the revised manuscript, figure legend titles, and related section titles have been edited as follows.

      | Item | Submitted version | Revised version |
      | --- | --- | --- |
      | Paper title | On the evolution of chaperones and co-chaperones and the exponential expansion of proteome complexity | On the evolution of chaperones and co-chaperones and the expansion of proteomes across the Tree of Life |
      | Section title | A Tree of Life analysis of the expansion of proteome complexity and chaperones | A Tree of Life analysis of the expansion of proteomes and chaperones |
      | Section title | The expansion of proteome size | The expansion of proteome size across the Tree of Life |
      | Section title | The expansion of proteome complexity | Proteome expansion by de novo innovations |
      | Figure 1 legend title | Expansion of proteome size | Expansion of proteome size across the Tree of Life |
      | Figure 2 legend title | Expansion of proteome complexity | Expansion of proteomes by de novo innovations |

      Further, changes have been made in the Summary and in the main text to exclude any impression that proteomes/organisms have become more complex with time. Rather we emphasized prokaryote versus eukaryote comparison.

      Comment-2. I think the manuscript would be improved if the authors significantly shortened the discussion of genome size evolution- this is fairly well understood, and could be covered briefly, especially as the main focus of the manuscript is on the evolution of chaperone and co-chaperone repertoire. They could also make clearer quantitative links between protein complexity and the evolution of chaperones and co-chaperones- perhaps this should be in the discussion? The authors might also consider referencing 'The evolution of genome complexity', which could be relevant to this manuscript and might make the work of broader interest.

      Response

      We thank the reviewer for this suggestion. The main focus of our paper is indeed the evolution of chaperones and co-chaperones but within the context of the expansion of proteomes. Having this focus in place, the discussion on proteome size evolution (section: The expansion of proteome size across the Tree of Life) has been revised and shortened to place more emphasis on the prokaryote versus eukaryote comparison.

      The suggestion to provide “clearer quantitative links between protein complexity and the evolution of chaperones and co-chaperones” is indeed very useful and we sincerely thank the reviewer. To address this suggestion we revised Figure 4 to quantitatively compare the expansion of proteomes and that of chaperones, under one roof. This Figure compares proteome parameters that supposedly demand more chaperone action in all three domains of life and simultaneously summarizes the expansion of the chaperone machinery lacking de novo innovations.

      The first paragraph of the Discussion section has been revised accordingly: it now walks the reader through the revised Figure 4 and finally introduces the dichotomy it implies.

      We did not understand the last comment “The authors might also consider referencing 'The evolution of genome complexity', which could be relevant to this manuscript and might make the work of broader interest.” We’d be glad to address it upon further clarification.

      Comment-3. The authors state 'protein trees were generated and compared with ToL to account for gene loss and transfer events'. The methodology for this procedure is not given in the manuscript. The authors should back up this point, and make it clear this is why they reconstruct the trees. Currently it is not convincing to me that the authors have found HGT given the considerable phylogenetic uncertainty in the basal events in the tree of life. I also expect the tree of a single protein to potentially lack information due to the short sequence considered and possible lack of power. The authors need to consider whether the data is really of high enough quality to assess this.

      Response

      Thank you for this suggestion. For the various chaperone families, we manually compared the protein trees with the Tree of Life. This is clearly stated in the revised Methods section (see Page 25, Lines 31-32). We agree, however, that identifying HGT and, more generally, interpreting trees of single domains that are highly diverged are tricky. We did our best to address these caveats. Specifically:

      We re-evaluated our work in the light of a recent study (PMID: 32316034). This paper discussed the phylogenetic uncertainties associated with molecular dating and re-evaluated the assignment of several protein families to LUCA. A careful analysis revealed that the reviewer is indeed right, meaning many of the HGT events shown in Figure 3B of the previous version were indistinguishable from the phylogenetic uncertainties.

      Accordingly, we revised the section “The core-chaperones emerged in early-diverging prokaryotes”. We removed the previous version Figure 3B, along with all instances of HGT events mentioned in the main text, except one (archaea to Firmicute HGT of HSP60, which is well-supported by the data and was also detected previously). Dating the emergence of chaperone families was also re-evaluated. Though the major conclusions were not altered, we discussed the phylogenetic uncertainties associated with our work and the overall confidence of each dating analysis. We believe these discussions would be very useful to the readers.

      Finally, we note that most of our key assignments (points of emergence, and major HGT events) are in agreement with previous works. Specifically: the emergence of HSP20 and HSP60 to LUCA (Sousa et al., 2016; Weiss et al., 2016) and HSP60 being horizontally transferred from archaea to Firmicute (Techtmann and Robb, 2010) and HSP20 being horizontally transferred between bacterial clades and between bacteria and archaea (Kriehuber et al., 2010).

      Comment-4. Methods- the authors could consider taking an alternative source of LUCA proteins, rather than those found in 'Nanoarchaeota and Aquificae': it's possible these are not representative of LUCA, and it seems a somewhat arbitrary choice- the authors could consider using one of the available curated sets, such as that generated by Ranea et al. (2006).

      Response

      The reviewer is right that a more robust LUCA set could be used. However, given that the revised manuscript focuses on comparison across the ToL, and foremost on the prokaryote versus eukaryote comparison, we don’t think that refining this set is important. This set was used for one purpose only: determining changes in domain length. Moreover, the 38 X-groups used for this analysis are, in fact, the ones present in all organisms across the ToL. Hence, we kept the original analysis, while mentioning that these 38 X-groups are conserved across the ToL, and removed the argument for LUCA assignment. See Page 5, Line 22.

      Comment-5. The patterns observed might only hold because of differences in the taxa that diverged pre and post LECA? The authors might consider subgroup analyses to ensure this is not the case. The authors could also consider using methods that take phylogeny into account.

      Response

      The reviewer is right that within prokaryotic domains proteomes do not seem to expand. For example, excluding a few early-diverging prokaryotes and parasites, proteome size in bacteria and archaea varies within 2000-3000 proteins per proteome. Only when pre-LECA and post-LECA organisms are compared, significant differences are observed. We thank the reviewer for this suggestion. We revised the main text to focus on prokaryote versus eukaryote comparison. This re-focusing does not change any of our major conclusions, but rather puts our analysis in the right context (see Comment 1).

      Minor comments

      Comment-6. 'Life's habitability has also expanded from its specific niche of emergence-likely deep-sea hydrothermal vents, to highly variable and extreme ranges of temperature, pressure, exposure to high UV-light, dehydration and free oxygen.' This is not really correct, as bacteria and archaea are found worldwide, and in the most extreme environments.

      Response

      Thank you for this suggestion. We removed the above-mentioned sentence.

      Comment-7. 'We reconciled the topology of our tree'- on first read this was not clear, I did not realise the authors were only building trees for subsets of the data- time tree is the best source for the overall topology. The phrase 'manually curated and adjusted' is used in the methods. This language is much too vague, and not a clear explanation of the steps taken.

      Response

      We apologize for this confusion. The overall topology of our Tree of Life is indeed taken from TimeTree. We edited the text on Page 4, Line 4 to clarify this issue.

      The obtained tree topology was manually curated and adjusted to depict eukaryotes stemming from Asgard archaea and Alphaproteobacteria, by an endosymbiosis event. This is clearly mentioned in the Methods section (see Page 22, Lines 24-28).

      Reviewer #2

      Summary

      Rebeaud and colleagues analyze evolution of chaperones compared to the evolution of whole proteome complexity across the entire tree of life. Their principal conclusions are well captured in the following quote from the Discussion:

      "Comparison of the expansion of proteome complexity versus that of core-chaperones presents a dichotomy-a linear expansion of core-chaperones supported an exponential expansion of proteome complexity. We propose that this dichotomy was reconciled by two features that comprise the hallmark of chaperones: the generalist nature of core-chaperones, and their ability to act in a cooperative mode alongside co-chaperones as an integrated network. Indeed, in contrast to core chaperones, there exist a consistent trend of evolutionary expansion of co-chaperones."

      Major comments

      Comment-1. The general theme of the evolution of proteome management is of obvious interest. Unfortunately, the entire analysis is shaky and fails to convincingly ascertain the authors' conclusions. There are many issues. Throughout the manuscript, the authors discuss 'expansion' of the proteome in bacteria, archaea and eukaryotes, creating the impression of a consistent evolutionary trend. No such trend actually exists if one considers the means or medians of proteome sizes within each of the three domains of life (there is a transition to greater complexity in eukaryotes). The maximum complexity, certainly, increases with time which can be attributed to the 'drunkard's walk' effect. This hardly qualifies as 'expansion'.

      Response

      The reviewer is right that within prokaryotes proteomes do not seem to significantly expand. Reviewer-1 raised a similar concern that prokaryotes and eukaryotes have been evolving for the same period of time and have not expanded significantly. We understand the misconception created by the earlier version and we thank the reviewers for pointing it out. Accordingly, we revised the main text to clarify these issues, as described in the following.

      Firstly, the main text was revised to emphasize the prokaryote versus eukaryote comparison. The reviewer agrees that compared to prokaryotes, “there is a transition to greater complexity in eukaryotes”. This re-focusing does not change any of our major conclusions, but rather provides a systematic comparison that is adequately supported by data.

      Secondly, we revised the Figures to rename the X-axis as “Order of divergence”, rather than “Divergence time (million years)” in the previous version. We emphasized the fact that the X-axis actually represents the relative order of divergence of the different clades, rather than absolute dates. This emphasis certainly does not create the impression of a consistent evolutionary trend. Instead, combined with the revised main text, it shows that clear trends of proteome expansion are observed only when pre-LECA and post-LECA organisms are compared.

      Comment-2. The authors further claim a 'linear' expansion of the chaperone set and 'exponential' expansion of the total proteome size. These are precise mathematical terms and, as such, require fitting to the respective functions. No such thing in this manuscript. Even apart from that shortcoming, the explanations of both 'linear' and 'exponential' are quite confusing. Thus, when explaining the 'linearity' of chaperone evolution, the authors refer to the lack of major innovation among the chaperones. This is correct in itself but has nothing to do with linearity. Apart from the aforementioned conceptual problems, the estimates of the 'exponential' growth of the proteome are naive, inconsistent and inaccurate.

      Response

      Our uses of ‘linear expansion’ versus ‘exponential expansion’ may have been confusing although we have defined quite clearly what we mean by that (i.e., that it is not the mathematical sense). The statement regarding “the lack of major innovation among the chaperones” was made in this context/definition and was consistent with it.

      Nonetheless, to avoid confusion, we revised the main text by excluding the ‘linear expansion’ and ‘exponential expansion’ terms. We simply stated that a torrent of de novo innovations has occurred during the expansion of proteomes from prokaryotes to eukaryotes. In contrast, the evolutionary history of core-chaperones lacks such major innovations. Accordingly, the title of the revised manuscript, figure legend titles, and related section titles have been edited as follows.

      Paper title
      Submitted version: On the evolution of chaperones and co-chaperones and the exponential expansion of proteome complexity
      Revised version: On the evolution of chaperones and co-chaperones and the expansion of proteomes across the Tree of Life

      Section title
      Submitted version: A Tree of Life analysis of the expansion of proteome complexity and chaperones
      Revised version: A Tree of Life analysis of the expansion of proteomes and chaperones

      Section title
      Submitted version: The expansion of proteome complexity
      Revised version: Proteome expansion by de novo innovations

      Figure 1 legend title
      Submitted version: Expansion of proteome size
      Revised version: Expansion of proteome size across the Tree of Life

      Figure 2 legend title
      Submitted version: Expansion of proteome complexity
      Revised version: Expansion of proteomes by de novo innovations

      Comment-3. As the base point for the expansion estimates for archaea and eukaryotes, the authors take parasitic forms. Even leaving aside the highly dubious claims that these organisms belong to the clades that diverged first from the respective ancestors, parasites are not an appropriate choice for such estimates because they certainly are products of reductive evolution. For bacteria, inconsistently, the authors choose a free-living form from a dubious ancient clade, and not even the one with the smallest genome. All taken together, this robs the expansion estimates of any substantial meaning.

      Response

      This point is valid overall, although we adamantly reject the insinuation of “dubious claims that these organisms belong to the clades that diverged first from the respective ancestors”. Firstly, we did not make any claims to this end, but took the ToL constructed by others (Hedges et al., 2015); secondly, the assertion that these claims are dubious needs to be backed up by counter-evidence or data and, with all due respect, neither was provided by the reviewer. However, what is of concern is that in a symbiont/parasite, chaperones of the host may have a key role, and thus the comparison to free-living organisms could be misleading. To address this concern, we excluded the obligatory endosymbiont Nanoarchaeum equitans and the parasitic organisms from the expansion estimates, and such discussions are now limited to free-living organisms only. Further, as described in response to Comment-1, the revised manuscript focuses on the prokaryote versus eukaryote comparison.

      Note that phylogenetic analysis often assigns parasitic and symbiotic organisms that have experienced reductive evolution as the earliest diverging clades of their corresponding kingdoms of life. Examples include Nanoarchaeum equitans, an obligate symbiont, assigned as the earliest diverging archaea (Hedges et al., 2015; Huber et al., 2002; Waters et al., 2003), and parasitic Excavate assigned as one of the earliest diverging eukaryotes (Burki et al., 2020; Simpson et al., 2002). In accordance with these studies, these parasitic and symbiotic organisms were included in our analysis. We acknowledged this fact in the Methods section (see Page 22, Lines 9-16).

      Comment-4. The authors do make a salient and I think essentially correct observation: chaperones typically comprise about 0.3% of the proteins in any organism. As such, this presents no dichotomy in evolutionary trends to be explained. Surely, as examined and discussed in the paper, eukaryotes also show significant increases in the size and domain content of the encoded proteins, suggesting the possibility that might need more chaperones. However, if this is the explanandum, rather than the number of proteins in the proteome as such, it should be clearly stated. Furthermore, it is quite natural to assume that this increase in protein complexity without a commensurate increase in the chaperone diversity, is enabled by higher expression of the chaperones as suggested in the Discussion of this paper. I doubt there is any big surprise here and even much need for an extended discussion let alone a special publication.

      Response

      As emphasized and shown, eukaryotes do not merely have larger proteomes in terms of the number of proteins or protein size; they also have a higher content of proteins that are prone to misfolding. This is shown explicitly in Figure 2 (namely, multidomain proteins, repeat, beta-rich proteins, etc.) and is reiterated in a summary figure (suggested by Reviewer 1). Further, in response to Reviewer-3’s suggestion, we showed that eukaryotes feature much higher proportions of aggregation-prone proteins per proteome than prokaryotes (Figure 2E).

      To further clarify, we revised Figure 4 to quantitatively compare the expansion of proteomes and that of chaperones, under one roof. This Figure compares proteome parameters that supposedly demand more chaperone action in all three domains of life and simultaneously summarizes the expansion of the chaperone machinery lacking de novo innovations.

      In addition, the first paragraph of the Discussion section is revised to state that from prokaryotes to eukaryotes, proteomes have expanded by duplication-divergence as well as by innovations (de novo emergence of new folds). Thus, it’s not about the size only (a challenge that a proportional expansion of chaperone genes would resolve, i.e., the 0.3%) but about proteome composition changing in a way that demands more and more chaperone action.

      We also agree with the assertion that “it is quite natural to assume that this increase in protein complexity without a commensurate increase in the chaperone diversity, is enabled by higher expression of the chaperones”. However, we belong to a group of scientists for whom natural assumptions are insufficient, and think that supporting evidence is of importance.

      Reviewer’s significance statement

      As such, in the opinion of this reviewer, there is no substantial advance over the existing knowledge in this paper. Should the authors wish to revise, they would need to develop robust methodology to measure proteome expansion. That would involve starting from reconstructed ancestors rather than any extant forms (let alone parasites). I doubt that such analysis, non-trivial in itself, reveals any strong, consistent trends other than the well known increase in complexity in eukaryotes.

      Response

      We agree that to assert evolutionary, time-dependent trends one needs to analyze phylogenies and reconstructed ancestors, but still think that a comparison of proteome and chaperone contents along the Tree of Life is meaningful. We thus respectfully, yet adamantly disagree with “no substantial advance over the existing knowledge”. We strongly believe, as does Reviewer-3, that the results and the model presented in this paper are “fascinating to consider and… will stimulate a good deal of important discussion…”.

      Reviewer #3

      Summary

      The manuscript by Rebeaud et al describes phylogenetic analyses of proteome and chaperone complexity. The authors analyzed species across the tree of life to predict the proteome and chaperone properties of ancestors spanning to the last universal common ancestor. Their analyses indicate that many proteome properties increased in complexity over evolutionary time including: average protein length, the number of multi-domain proteins, the size of the proteome, the number of repeat proteins, and the number of beta-superfold proteins that are known to be difficult to fold. Their analyses also indicate an expansion in chaperone families that corresponds to the increase in proteome complexity. Based on their analyses, the authors propose a model where early life relied on a limited number of chaperones (Hsp20 and Hsp60) and that as proteome complexity evolved, so did chaperone complexity. Core chaperones including Hsp90, Hsp70, and Hsp100 evolved relatively early, and later chaperone evolution was driven by the appearance and alterations of co-chaperones and auxiliary factors as well as by increases in the protein abundance of chaperones.

      Major concerns

      Comment-1. This work is appropriately based on phylogenetic inferences, but as such, the limitations and uncertainties of phylogenetic inferences need to be discussed. This in no way takes away from the work, quite the opposite, it would make it richer by encouraging broader interpretations where justified and clear understanding of where support for the model is strongest. Posterior probabilities need to be discussed and the range of properties that a likely ancestor might have based on the data should be discussed. How this impacts the conclusions and models should be discussed. Throughout the manuscript, the authors present most-likely ancestral models (as I understood it), what are the next most likely models? How much power is there to distinguish one model from another? It would be very helpful to have a section describing the limitations and uncertainties of the phylogenetic analyses and how these relate to the main findings and conclusions.

      Response

      We thank the reviewer for this suggestion. Reviewer-1 raised a similar suggestion (see Comment-3). The phylogenetic analysis in our paper included dating the emergence of core- and co-chaperone families, and an attempt to infer their major HGT events, foremost in relation to the origin of eukaryotic chaperones. To highlight the uncertainties of phylogenetic inferences we re-evaluated our work in the light of a recent study (PMID: 32316034) that carefully analyzed the uncertainties associated with the assignment of several protein families to LUCA.

      Ideally, for a protein family to be assigned to LUCA, there must be a single split of bacterial and archaeal domains at the root of the protein tree with strong bootstrap support, and the inter-domain branches would be longer than the intra-domain branches (PMID: 32316034). In the revised main text we discussed that only the HSP60 protein tree satisfies this criterion. The HSP20 protein tree depicts a clear single split of bacterial and archaeal domains at the root, albeit with weak bootstrap support, and its inter-domain branch lengths are smaller than its intra-domain branch lengths. We discussed that this is indeed a case of phylogenetic uncertainty, which means the sequence of this small, single-domain chaperone lacks the information needed to make reliable inferences about the basal events in the ToL.
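
      The criteria above can be read as a simple checklist. The following is a minimal sketch of that checklist, not the procedure used in the manuscript or in the cited study. The class and function names are hypothetical, the 70% bootstrap cut-off and the comparison of mean branch lengths are assumptions made for illustration, and the two example summaries contain made-up numbers that only mimic the qualitative pattern described for HSP60 and HSP20.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class ProteinTreeSummary:
    """Hypothetical summary of a protein tree relevant to a LUCA assignment."""
    single_domain_split_at_root: bool        # one bacteria/archaea split at the root
    root_split_bootstrap: float              # bootstrap support of that split, in percent
    inter_domain_branch_lengths: List[float]
    intra_domain_branch_lengths: List[float]


def supports_luca_assignment(tree: ProteinTreeSummary, min_bootstrap: float = 70.0) -> bool:
    """Return True only if all three criteria are met.

    min_bootstrap is an assumed threshold; the text only says "strong" support.
    Branch lengths are compared by their means, which is also an assumption.
    """
    if not tree.single_domain_split_at_root:
        return False
    if tree.root_split_bootstrap < min_bootstrap:
        return False
    return mean(tree.inter_domain_branch_lengths) > mean(tree.intra_domain_branch_lengths)


# Made-up illustrative values, NOT measurements from the manuscript.
hsp60_like = ProteinTreeSummary(True, 95.0, [1.2, 1.4], [0.4, 0.5, 0.6])
hsp20_like = ProteinTreeSummary(True, 35.0, [0.3, 0.4], [0.6, 0.7, 0.8])

hsp60_supported = supports_luca_assignment(hsp60_like)
hsp20_supported = supports_luca_assignment(hsp20_like)
```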

      In addition, the HGT events discussed in the previous version appear to be indistinguishable from phylogenetic uncertainties, and we removed all instances of HGT events mentioned in the main text as well as in Figure 3B. Only one HGT event, the horizontal transfer of HSP60 from archaea to Firmicutes, which is well supported by the data, is kept in the revised main text. We believe these discussions will be very useful to readers.

      Finally, we note that most of our key assignments (points of emergence, and major HGT events) are in agreement with previous works. Specifically: the assignment of the emergence of HSP20 and HSP60 to LUCA (Sousa et al., 2016; Weiss et al., 2016), HSP60 being horizontally transferred from archaea to Firmicutes (Techtmann and Robb, 2010), and HSP20 being horizontally transferred between bacterial clades and between bacteria and archaea (Kriehuber et al., 2010).

      Comment-2. General features that impact foldability, including contact order, should be discussed and what features can be searched for in genomes that relate to these - e.g. beta-rich proteins.

      Response

      Thanks for this valuable idea! Contact order and other predictors of problematic folding are highly relevant, but their analysis is structure-based and hence inapplicable on the proteome (sequence) scale. We did, however, estimate the proportion of aggregation-prone proteins in the proteome. These proteins were identified by the CamSol method, which assigns poorly soluble regions from sequence data. Indeed, some of these predicted ‘poorly soluble segments’ refer to the hydrophobic core of the respective folded state instead of ‘true’ aggregation hotspots. With this unavoidable potential caveat, it appears that compared to prokaryotes, aggregation-prone proteins in the proteome have become nearly 6-fold more frequent in Chordates.

      The following changes were made to accommodate this new analysis:

      Figure 2 is revised to include a new panel (panel-E) that shows the expansion of aggregation-prone proteins in the proteome across the Tree of Life. The same result is summarized in the summary Figure 4.

      A new paragraph entitled “Proteins predicted as aggregation-prone became ~6-fold more frequent in the proteome” is added to the Results section, which describes the principle and the main results (see Page 7, Lines 14-28).

      The methodology is included in the Methods section, in a paragraph entitled “Predicted proportion of aggregation-prone proteins in the proteome”, see Page 24 Lines 17-27. For each representative organism, the percent of aggregation-prone proteins in proteome data are provided as Data S10.

      This analysis is also included in the revised Abstract: “Proteins prone to misfolding and aggregation, such as repeat and beta-rich proteins, proliferated ~600-fold, and accordingly, proteins predicted as aggregation-prone became 6-fold more frequent in mammalian compared to bacterial proteomes.” See Page 2, Lines 7-9.

      Comment-3. "Core" chaperones needs to be defined.

      Response

      Thank you for this suggestion. We restructured Page 3 Lines 19-23 in the Introduction to clearly explain this aspect. The current text is quoted below.

      “Chaperones can be broadly divided into core- and co-chaperones. Core-chaperones can function on their own, and include ATPases HSP60, HSP70, HSP100, and HSP90 and the ATP-independent HSP20. The basal protein holding, unfolding, and refolding activities of the core-chaperones are facilitated and modulated by a range of co-chaperones such as J-domain proteins (Caplan, 2003; Duncan et al., 2015; Schopf et al., 2017).”

      Minor concerns and thoughts

      Comment-4. This manuscript stimulated me to think about the dynamics between chaperone evolution and proteome evolution. The ability to tolerate proteins that need chaperones seems linked to major evolutionary innovations. Once you have these innovations though, you are addicted to the chaperones - and an expansion of the number of sub-optimal proteins. These ideas seem like they would be valuable to include in the discussion of this work. More generally, it would be wonderful to have a discussion of future directions that this work may spark.

      Response

      This is indeed a fascinating question, or set of questions, that we have also become intrigued by following this work. We introduced a short section, though more of an ‘appetizer’ than a detailed discussion, as we know almost nothing about the co-evolution of new proteins and chaperones.

      Reviewer’s significance statement

      This manuscript provides a fascinating glimpse back in time of a fundamental interplay - between chaperone evolution/addiction and proteome evolution. I am not an expert in phylogenetic analyses so I cannot judge the details of the analyses. As an expert in molecular evolution and chaperones, I found the approach and model fascinating to consider and I believe it will stimulate a good deal of important discussion in these fields. I have one major concern that I feel ought to be addressed in the manuscript and a number of points that I would encourage the authors to consider. I am sure that these can be readily addressed and I look forward to seeing this work published and the further discussion and ideas that it may stimulate.

      Response

      Thank you!

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Rebeaud et al describes phylogenetic analyses of proteome and chaperone complexity. The authors analyzed species across the tree of life to predict the proteome and chaperone properties of ancestors spanning to the last universal common ancestor. Their analyses indicate that many proteome properties increased in complexity over evolutionary time including: average protein length, the number of multi-domain proteins, the size of the proteome, the number of repeat proteins, and the number of beta-superfold proteins that are known to be difficult to fold. Their analyses also indicate an expansion in chaperone families that corresponds to the increase in proteome complexity. Based on their analyses, the authors propose a model where early life relied on a limited number of chaperones (Hsp20 and Hsp60) and that as proteome complexity evolved, so did chaperone complexity. Core chaperones including Hsp90, Hsp70, and Hsp100 evolved relatively early, and later chaperone evolution was driven by the appearance and alterations of co-chaperones and auxiliary factors as well as by increases in the protein abundance of chaperones.

      Major concerns:

      1. This work is appropriately based on phylogenetic inferences, but as such, the limitations and uncertainties of phylogenetic inferences need to be discussed. This in no way takes away from the work, quite the opposite, it would make it richer by encouraging broader interpretations where justified and clear understanding of where support for the model is strongest. Posterior probabilities need to be discussed and the range of properties that a likely ancestor might have based on the data should be discussed. How this impacts the conclusions and models should be discussed. Throughout the manuscript, the authors present most-likely ancestral models (as I understood it), what are the next most likely models? How much power is there to distinguish one model from another? It would be very helpful to have a section describing the limitations and uncertainties of the phylogenetic analyses and how these relate to the main findings and conclusions.
      2. General features that impact foldability, including contact order, should be discussed and what features can be searched for in genomes that relate to these - e.g. beta-rich proteins.
      3. "Core" chaperones needs to be defined.

      Minor concerns and thoughts:

      1. This manuscript stimulated me to think about the dynamics between chaperone evolution and proteome evolution. The ability to tolerate proteins that need chaperones seems linked to major evolutionary innovations. Once you have these innovations though, you are addicted to the chaperones - and an expansion of the number of sub-optimal proteins. These ideas seem like they would be valuable to include in the discussion of this work. More generally, it would be wonderful to have a discussion of future directions that this work may spark.

      Significance

      This manuscript provides a fascinating glimpse back in time of a fundamental interplay - between chaperone evolution/addiction and proteome evolution. I am not an expert in phylogenetic analyses so I cannot judge the details of the analyses. As an expert in molecular evolution and chaperones, I found the approach and model fascinating to consider and I believe it will stimulate a good deal of important discussion in these fields. I have one major concern that I feel ought to be addressed in the manuscript and a number of points that I would encourage the authors to consider. I am sure that these can be readily addressed and I look forward to seeing this work published and the further discussion and ideas that it may stimulate.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Rebeaud and colleagues analyze evolution of chaperones compared to the evolution of whole proteome complexity across the entire tree of life. Their principal conclusions are well captured in the following quote from the Discussion:

      "Comparison of the expansion of proteome complexity versus that of core-chaperones presents a dichotomy - a linear expansion of core-chaperones supported an exponential expansion of proteome complexity. We propose that this dichotomy was reconciled by two features that comprise the hallmark of chaperones: the generalist nature of core-chaperones, and their ability to act in a cooperative mode alongside co-chaperones as an integrated network. Indeed, in contrast to core chaperones, there exist a consistent trend of evolutionary expansion of co-chaperones."

      The general theme of the evolution of proteome management is of obvious interest. Unfortunately, the entire analysis is shaky and fails to convincingly ascertain the authors' conclusions. There are many issues. Throughout the manuscript, the authors discuss 'expansion' of the proteome in bacteria, archaea and eukaryotes, creating the impression of a consistent evolutionary trend. No such trend actually exists if one considers the means or medians of proteome sizes within each of the three domains of life (there is a transition to greater complexity in eukaryotes). The maximum complexity, certainly, increases with time which can be attributed to the 'drunkard's walk' effect. This hardly qualifies as 'expansion'. The authors further claim a 'linear' expansion of the chaperone set and 'exponential' expansion of the total proteome size. These are precise mathematical terms and, as such, require fitting to the respective functions. No such thing in this manuscript. Even apart from that shortcoming, the explanation of both 'linear' and 'exponential' are quite confusing. Thus, when explaining the 'linearity' of chaperone evolution, the authors refer to the lack of major innovation among the chaperones. This is correct in itself but has nothing to do with linearity. Apart from the aforementioned conceptual problems, the estimation of the 'exponential' growth of the proteome are naive, inconsistent and inaccurate. As the base point for the expansion estimates for archaea and eukaryotes, the authors take parasitic forms. Even leaving aside the highly dubious claims that these organisms belong to the clades that diverged first from the respective ancestors, parasites are not an appropriate choice for such estimates because they certainly are products of reductive evolution. For bacteria, inconsistently, the authors choose a free-living form from a dubious ancient clade, and not even the one with the smallest genome. All taken together, this robs the expansion estimates of any substantial meaning.

      The authors do make a salient and I think essentially correct observation: chaperones typically comprise about 0.3% of the proteins in any organism. As such, this presents no dichotomy in evolutionary trends to be explained. Surely, as examined and discussed in the paper, eukaryotes also show significant increases in the size and domain content of the encoded proteins, suggesting the possibility that might need more chaperones. However, if this is the explanandum, rather than the number of proteins in the proteome as such, it should be clearly stated. Furthermore, it is quite natural to assume that this increase in protein complexity without a commensurate increase in the chaperone diversity, is enabled by higher expression of the chaperones as suggested in the Discussion of this paper. I doubt there is any big surprise here and even much need for an extended discussion let alone a special publication.

      Significance

      As such, in the opinion of this reviewer, there is no substantial advance over the existing knowledge in this paper. Should the authors wish to revise, they would need to develop robust methodology to measure proteome expansion. That would involve starting from reconstructed ancestors rather than any extant forms (let alone parasites). I doubt that such analysis, non-trivial in itself, reveals any strong, consistent trends other than the well known increase in complexity in eukaryotes.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors present well written work on the evolution of proteome size and complexity, and the corresponding changes in chaperone proteins. Interestingly, they find chaperone copy numbers increase linearly with proteome size, despite the increasing 'complexity' of, in particular, post-LECA genomes. They suggest that to address the rise in complexity, organisms express chaperones at higher levels and an expanding network of co-chaperones has evolved across the tree of life.

      Major comments:

      -Summary reads strangely relative to the rest of the manuscript, and lists facts in a way that makes the purpose of the study confusing. I think most readers will dislike the characterisation of evolution as a progress from simple to complex, and the authors might want to avoid this language throughout the manuscript: bacteria and archaea have also been evolving over this period of time, and have not become more 'complex'? Similarly, the authors should reconsider their figure legend titles. As a specific example, 'in the course of evolution' should become 'across the tree of life'.

      -I think the manuscript would be improved if the authors significantly shortened the discussion of genome size evolution- this is fairly well understood, and could be covered briefly, especially as the main focus of the manuscript is on the evolution of chaperone and co-chaperone repertoire. They could also make clearer quantitative links between protein complexity and the evolution of chaperones and co-chaperones- perhaps this should be in the discussion? The authors might also consider referencing 'The evolution of genome complexity', which could be relevant to this manuscript and might make the work of broader interest.

      -The authors state 'protein trees were generated and compared with ToL to account for gene loss and transfer events'. The methodology for this procedure is not given in the manuscript. The authors should back up this point, and make it clear this is why they reconstruct the trees. Currently it is not convincing to me that the authors have found HGT given the considerable phylogenetic uncertainty in the basal events in the tree of life. I also expect the tree of a single protein to potentially lack information due to the short sequence considered and possible lack of power. The authors need to consider whether the data is really of high enough quality to assess this.

      -Methods- the authors could consider taking an alternative source of LUCA proteins, rather than those found in 'Nanoarchaeota and Aquificae': it's possible these are not representative of LUCA, and it seems a somewhat arbitrary choice. The authors could consider using one of the available curated sets, such as that generated by Ranea et al. (2006).

      -The patterns observed might only hold because of differences in the taxa that diverged pre and post LECA? The authors might consider subgroup analyses to ensure this is not the case. The authors could also consider using methods that take phylogeny into account.

      Minor comments:

      'Life's habitability has also expanded from its specific niche of emergence - likely deep-sea hydrothermal vents, to highly variable and extreme ranges of temperature, pressure, exposure to high UV-light, dehydration and free oxygen.' This is not really correct, as bacteria and archaea are found worldwide, and in the most extreme environments.

      'We reconciled the topology of our tree'- on first read this was not clear, I did not realise the authors were only building trees for subsets of the data- time tree is the best source for the overall topology. The phrase 'manually curated and adjusted' is used in the methods. This language is much too vague, and not a clear explanation of the steps taken.

      Significance

      The work presents interesting results that suggest that more 'complex' organisms have evolved a strategy to cope with increasing proteome size, and is interesting to researchers in the field of molecular evolution.

      I am a researcher in population genetics and molecular evolution.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      This study outlines calcium probes for assessing the poorly understood role of peroxisomes in calcium signaling. The authors suggest that these organelles sequester calcium from either calcium influx across the plasma membrane or from release from the ER/SR. This is important since we need to know more about the roles of these organelles in calcium homeostasis and signaling. However, it needs to be robustly demonstrated that the probes are targeted to the right organelle without confounding contamination from other organelles which can be very significant even for a small degree of mis-targeting.

      Major

      1. The differences between the signals seen with the peroxisomal and cytosolic D3 versions are not compelling, other than a dampened spike with the former (higher resting levels, smaller peak). See below for pH concerns.
      2. How clean is the peroxisome distribution? Prove that D3 spillover from its being partially in (or on) other compartments (e.g. cyto, ER) is not contributing to the changes. Selective manipulation of Ca2+ in these other compartments should not affect the peroxisome signal.
        • a. For example, the small changes in the D3-px could be explained by the peroxisome not changing at all, with the signal(s) from the other compartments (where larger responses are observed) contaminating the response.
        • b. E.g. if in the ER lumen, the signal should be eliminated with SERCA inhibitors (thapsigargin, CPA). They used thapsigargin in cardiac myocytes; why not in HeLa during characterization?
      3. Any Ca2+ reporter will be pH-sensitive to an extent, even D3 (Ca2+ binding, inherent fluorescent proteins).
        • a. It is essential to prove that the signal changes are not due to changes in peroxisomal pH. Target pH-sensitive proteins to the peroxisomal lumen by the same strategy and show that the same Ca2+ interventions do not cause pH changes.
        • b. The authors claim different resting levels of [Ca2+] in cytosol/mitochondria/peroxisome. The resting FRET level also depends on the resting pH of the compartments, which may also be different. Certainly, mitochondria are more alkaline than the cytosol. Again, to interpret these as real Ca2+ differences requires the pH to be accounted for.
      4. I am puzzled by the model, in particular in view of Fig 3. The genetically-encoded calcium indicator (GECI) is allegedly on the cytosolic face of the peroxisome and measuring peri-peroxisomal Ca2+.
        • a. The changes with this reporter look pretty similar to the luminal reporter (save that the resting ratio may be lower). I don't understand how the lumen [Ca2+] > cytosolic [Ca2+] without a higher local [Ca2+] (unless there is an energy-driven uptake mechanism, but then how does this fit in with ER-driven Ca2+ release?).
      5. The claim that resting peroxisome [Ca2+] is higher than cytosol is questionable. Is this a calibration artifact (e.g. compartment pH-differences or the reporter behaves differently in the lumen)? Such a gradient could not be sustained without energy-dependent Ca2+ uptake. The authors make no discussion of this.

      Minor

      1. Quantitate localization. Pearson's coefficients for GECIs and Peroxisomes.
      2. Different upstroke rates of D3 with His vs Cao. Quantify.
      3. Page 5. Line 161. 'Different sites', do the authors mean different sides? Similarly, the Legend of Fig 3.
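
      As an illustration of the colocalization measure requested in minor point 1, the following is a minimal sketch of Pearson's coefficient between two image channels. It assumes the pixel intensities of the probe channel and the peroxisome-marker channel have already been flattened into equal-length lists; the function name and the example values are hypothetical and are not taken from the manuscript.

```python
from math import sqrt


def pearson_colocalization(channel_a, channel_b):
    """Pearson's r between two equally sized lists of pixel intensities."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (>= 2 pixels)")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    # Covariance and variances (un-normalised; the 1/n factors cancel in r).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined r")
    return cov / sqrt(var_a * var_b)


# Hypothetical pixel values: a probe signal that tracks the marker.
probe = [10.0, 20.0, 30.0, 40.0]
marker = [12.0, 24.0, 36.0, 48.0]
r = pearson_colocalization(probe, marker)
```

      In practice the coefficient would be computed per cell after background subtraction and restricted to a region of interest, since background pixels can inflate the correlation.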

      Significance

      Good peroxisome calcium probes are important to the general calcium signaling field. This is fundamental science of interest to all cell biologists.

      There has been little published on peroxisome calcium, although for example, the Pozzan lab published a paper in JBC in 2008 on a GFP-based lumenally targeted peroxisome probe. There is contradictory data in the field and reliable new approaches are needed.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript by Sargsyan et al. describes an unappreciated role for peroxisomes in calcium dynamics. Specifically, the authors propose that GPCR/VDCC/SOCE-mediated cytosolic Ca2+ elevation is rapidly sensed by peroxisomes and sequestered. The authors used/generated peroxisome-targeted genetically encoded Ca2+ indicators, which are an elegant and powerful tool to monitor luminal Ca2+ dynamics. While the results and conclusions are novel, there are some important gaps that need to be addressed for consideration for publication in EMBO J.

      Comments:

      Peroxisomes are single-membrane-bound organelles which are conserved across species spanning from yeast to humans. While housing only ~100 proteins, they are responsible for essential steps in lipid metabolism, amino acid metabolism and ROS homeostasis. Unlike other organelles, peroxisomes import fully folded and cofactor-bound proteins into their matrix. Though peroxisomes house specific metabolic functions, there is extensive crosstalk with other organelles, including mitochondria. It is essential to test and define whether silencing/knockdown of mitochondrial Ca2+ transport components like MCU will impact peroxisome Ca2+ uptake upon stimulation with histamine or electrical stimulation.

      Since peroxisomes buffer a significant amount of Ca2+, it is worth testing whether blockade of mitochondrial Ca2+ uptake alters peroxisome-mediated Ca2+ influx. This analysis will provide the Ca2+ uptake rates of mitochondria vs peroxisomes (Mallilankaraman K. et al., Cell 2012 and Nemani N. et al., Science Signaling 2020).

      Because peroxisomal synthesis of plasmalogens is dependent on Ca2+ and oxygen tension, it is essential to show that altering Ca2+ controls plasmalogen synthesis.

      In the introduction, the authors state that "Elevated mitochondrial uptake increases mitochondrial reactive oxygen species (ROS) production and is associated with heart failure and ischemic brain injury (Starkov et al., 2004; Santulli et al., 2015)." These cited articles only remotely link MCU and ROS elevation. It is important to point out that Tomar et al. (2016, Cell Reports) clearly demonstrated that genetic ablation of MCU suppresses mROS production that is mitochondrial Ca2+ dependent.

      Significance

      The significance of the work is very high. The authors employ a variety of complementary techniques and experimental systems to demonstrate that peroxisomes indeed buffer a large quantity of Ca2+ upon stimulation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      These are straightforward studies aimed at developing probes to assess peroxisomal Ca2+ at rest and in response to receptor stimulation. The probes were designed to measure intraperoxisomal Ca2+ and the Ca2+ that the peroxisome experiences when cytoplasmic Ca2+ is increased. The probes fill a need in understanding peroxisomal Ca2+ and Ca2+ signaling in general and should be very useful to investigators in the field.

      The comments are aimed to help in improving the studies and taking them to the next stage.

      The grammar needs improvement and the introduction needs sharpening. It is long and, in many places, not to the point. The results and discussion sections are also quite verbose.

      The sidedness of the probes needs to be validated further, especially since the peroxisomal Ca2+ increase follows the cytoplasmic one and the slower reduction rate may result from the environment experienced by the probe. Simple experiments: how do the probes respond to a Ca2+ ionophore; is Ca2+ reduced rapidly when it is removed from the media of digitonin-permeabilized cells; how do the cytoplasmic and peroxisomal thapsigargin responses compare using the protocols in 2A and 4A? The sidedness of PEX13-D3cpV was not examined.

      Calculations of peroxisomal Ca2+ are based on Kd values reported in the literature. The Kds of D3cpV-px and PEX13-D3cpV should be determined in the peroxisome in permeabilized cells for the numbers to have any meaning.
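
      To illustrate why the choice of Kd matters, here is a minimal sketch that assumes the widely used ratiometric calibration [Ca2+] = Kd * ((R - Rmin) / (Rmax - R))^(1/n). Whether the manuscript used exactly this form is an assumption; the function name and all numerical values below are hypothetical placeholders, not values from the manuscript or the literature.

```python
def ratio_to_calcium(ratio, ratio_min, ratio_max, kd, hill=1.0):
    """Convert a FRET ratio to [Ca2+] using the standard ratiometric calibration.

    The result is in the same units as kd. ratio_min and ratio_max are the
    ratios measured at zero and saturating Ca2+ respectively.
    """
    if not ratio_min < ratio < ratio_max:
        raise ValueError("ratio must lie strictly between ratio_min and ratio_max")
    if kd <= 0 or hill <= 0:
        raise ValueError("kd and hill must be positive")
    return kd * ((ratio - ratio_min) / (ratio_max - ratio)) ** (1.0 / hill)


# Hypothetical numbers: the same measured ratio interpreted with two Kd values.
literature_kd = 0.6  # placeholder value, arbitrary units
in_situ_kd = 1.2     # placeholder for a Kd re-measured inside the organelle
ca_literature = ratio_to_calcium(2.0, 1.0, 3.0, literature_kd)
ca_in_situ = ratio_to_calcium(2.0, 1.0, 3.0, in_situ_kd)
```

      Because the estimate scales linearly with Kd, any shift of the apparent Kd inside the peroxisome (for example through pH or ionic strength) translates directly into the reported concentration.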

      How does the localization of the probes look in the differentiated cardiomyocytes? How does it compare to RyRs, VACC, etc.?

      The major weakness of the study is that the probes are used only as a tool. To enhance the study and bring it beyond an excellent technical achievement, the authors should use them to study a significant Ca2+-dependent peroxisomal function and show how the use of the tools illuminates the role of Ca2+ in such a function.

      Significance

      These are straightforward studies aimed at developing probes to assess peroxisomal Ca2+ at rest and in response to receptor stimulation. The probes were designed to measure intraperoxisomal Ca2+ and the Ca2+ that the peroxisome experiences when cytoplasmic Ca2+ is increased. The probes fill a need in understanding peroxisomal Ca2+ and Ca2+ signaling in general and should be very useful to investigators in the field.

      The major weakness of the study is that the probes are used only as a tool. To enhance the study and bring it beyond an excellent technical achievement, the authors should use them to study a significant Ca2+-dependent peroxisomal function and show how the use of the tools illuminates the role of Ca2+ in such a function.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:** This interesting study by Putker et al. showed that circadian rhythmicity persists in several typical circadian assay systems lacking Cry, including Cry knockout mouse behavior and gene expression in Cry knockout fibroblasts. They further demonstrated weak but significant circadian rhythmicity in Cry- and Per- knockout cells. Cry- (and potentially Per-)-independent oscillations are temperature compensated, and CKId/e still has a role in the period regulation of Cry-independent oscillations.

      **Major comments:**

      1) The authors propose that the essential role of mammalian Cryptochrome is to bring the robust oscillation. As the authors analyze in many parts, the robustness of oscillation can be validated by the (relative) amplitude and phase/period variation, both of which should be affected significantly by the method for cell synchronization. Unfortunately, the method for synchronization is not adequately written in this version of supplementary information. This reviewer has no objection to the "iterative refinement of the synchronization protocol" but at least the correspondence between which methods were used in which experiments needs to be clearly explained. The detailed method may be found in the thesis of Dr. Wong, but the methods used in this manuscript need to be detailed within this manuscript.

      We thank the reviewer for recognising the importance of different synchronisation protocols. In experiments where bioluminescent CKO rhythms were observed, different synchronisation protocols resulted in similar results when comparing WT with CKO cells. The different synchronisation methods used in each experiment are now specified in the supplementary methods.

      2) The authors revealed that CKO mice have apparent behavioral rhythmicity under the condition of LL>DD. This is an intriguing finding. However, it should be carefully evaluated whether this rhythmicity (16 hr cycle) is the direct consequence of the circadian rhythmicity observed in CKO and CPKO cells (24 hr cycle), because the period lengths differ considerably. Is it possible to induce the 16 hr periodicity in CKO mouse behavior by a 16 hr-L:16 hr-D cycle? Is another plausible possibility that the 16 hr rhythmicity is the mouse version of internal desynchronization, or another type of methamphetamine-induced oscillation/food-entrainable oscillation?

      The reviewer makes an excellent suggestion. As described in the manuscript text (page 13), CKO mice have already been shown to entrain to restricted feeding cycles (Iijima et al., 2005) and we therefore assessed whether CKO rhythms would entrain to a 16h day as suggested. Whilst CKO (but not WT) mice showed 16h behavioural rhythms during entrainment, they were arrhythmic under constant darkness thereafter (Revised Figure S2A). CKO cellular rhythms show reduced robustness under constant conditions ex vivo, and our other work has revealed that CRY-deficiency renders cells much more susceptible to stress (Wong et al, 2020, BioRxiv). The parsimonious explanation, therefore, is that whilst the cellular timing mechanism remains functional when CRY is absent, the amplitude of cellular clock outputs is severely attenuated (as we showed previously in Hoyle et al., Sci Trans Med, 2017) in a fashion that impairs the fidelity of intercellular synchronisation under most conditions in vivo, as well as the molecular mechanisms of entrainment to light-dark cycles.

      With respect to the apparent discrepancy between mean periods of CKO cultured cells (~21h), SCN (~19h) and mice (~17h), this is also observed in WT cells (~26h), SCN (~25h) and mice (~24h), simply with a smaller effect size and longer intrinsic period.

      We believe this difference in effect size can adequately be explained by differences in oscillator coupling, combined with the reduced robustness of CKO timekeeping. In Figure 1F we show that the range of rhythmic periods expressed by cultured CKO fibroblasts (14-30h) is much greater than for their WT counterparts (range of 22-26h), or that which is observed when cellular oscillators are coupled in CKO SCN (19h). Thus the period of CKO oscillations is demonstrably more plastic (less robust) than WT, with a cell-intrinsic tendency towards shorter period which is revealed more clearly when oscillators are coupled.

      In vivo there is more oscillator coupling in the intact SCN than in an isolated slice, from which communication with the caudal and rostral hypothalamus has been removed. Thus it seems plausible that increased coupling in vivo, combined with positive feedback via behavioural cycles of feeding and locomotor activity, resonates with a common frequency which is shorter than in isolated tissue.

      Critically, for both WT and CKO mice/SCN, the circadian period lies within the range of periods observed in isolated fibroblasts. To communicate this rather nuanced point we have inserted the following text into the supplementary discussion:

      “Circadian timekeeping is a cellular phenomenon. Co-ordinated ~24h rhythms in behaviour and physiology are observed in multi-cellular mammals under non-stressed conditions when individual cellular rhythms are synchronised and amplified by appropriate extrinsic and intrinsic timing cues. In light of short period (~16.5h) locomotor rhythms observed in CKO mice after transition from constant light to constant dark, but failure to entrain to 12h:12h light:dark cycles, it seemed plausible that either CKO mice might entrain to a short 8h:8h light:dark (16h day) or else have a general deficiency to entrainment by light:dark cycles. The data in Figure S2 supports the latter possibility, in that neither WT nor CKO mice stably entrained to 16h cycles whereas WT but not CKO mice entrained to 24h days. The bioluminescence oscillations observed in CKO cells conform to the long-established definition of a circadian rhythm (temperature-compensated ~24h period of oscillation with appropriate phase-response to relevant environmental stimuli). Whereas the locomotor rhythms observed in CKO mice under quite specific environmental conditions correlate with both the cellular and SCN data to suggest the persistence of capacity to maintain behavioural rhythms close to the circadian range, but which is masked under most circumstances. We suggest that in vivo the (pathophysiological) stress of CRY-deficiency is epistatic to the expression of daily rhythms in locomotor activity following standard entrainment by light:dark cycles and thus, whilst not arrhythmic, also cannot be described as circadian in the strictest sense.”

      3) The authors proposed that CKId/e at least in part is the component of cytoscillator (Fig. 5D), and turnover control of PER (likely to be controlled by CKId/e) may be an interaction point between cytoscillator and canonical circadian TTFL (Fig. 4). Strictly speaking, this model is not directly supported by the experimental setting of the current manuscript. The contribution of CKId/e is evaluated in the presence of PER by monitoring the canonical TTFL output (i.e. PER2::LUC); thus it is not clear whether the kinase determines the period of cytoscillator. It would be valuable to ask whether PF and CHIR have the period-lengthening effect on Nr1d1:LUC in the CPKO cells.

      Another excellent suggestion, thanks. The experiment, showing similar results in CKO and CPKO cells, was performed and is now reported in Revised Figure S5D. The text was amended as follows: “We found that inhibition of CK1d/e and GSK3-α/β had the same effect on circadian period in CKO cells, CPKO cells, and WT controls (Figure 5A, B, S5A, B, D).”

      Moreover, our data are further supported by findings in RBCs, where CK1 inhibition affects circadian period in a similar manner as in WT and CKO cells (Beale et al, JBR 2019).

      **Minor comments:**

      4) The authors argue that the CKO cells' rhythmicity is entrained by the temperature cycle (Fig. 2C). Because the data from CKO cells show only one peak after release into the constant-temperature phase, it is difficult to conclude whether the cells are entrained or just respond to the final temperature shift.

      We agree with the reviewer and have replaced the original figure with another recording that includes an extra circadian cycle in free-running conditions (Revised Figure 2C).

      5) It would be useful for readers to provide information on the known phenotype of TIMELESS knockout flies; TIM is widely accepted as an essential component of the circadian clock in flies; are there any studies showing the presence of circadian rhythmicity in Tim-knockout flies (even if it is an oscillation seen in limited conditions, such as the neonatal SCN rhythm in mammalian Cry knockout)?

      The reviewer is correct that TIM is widely accepted as an essential component of the circadian clock in flies. Using more sensitive modern techniques however, ~50% of classic Tim01 mutant flies exhibit significant behavioural rhythms in the circadian range under constant darkness, as reported:

      https://opus.bibliothek.uni-wuerzburg.de/frontdoor/index/index/year/2015/docId/11914

      For this reason we employed a full gene knockout of the Timeless gene (Lamaze et al., Sci Rep, 2017), where the majority of flies are behaviourally arrhythmic under constant conditions following standard entrainment by light cycles and therefore represents a more appropriate model for CRY-deficient cells.

      We have revised the legend of Figure S2 to include the following:

      “N.B. The generation of Timout flies is reported in Lamaze et al, Sci Rep, 2017. Similar to CRY-deficient mice, whole gene Timeless knockout flies are characterised as being behaviourally arrhythmic under constant darkness following entrainment by light:dark cycles: https://opus.bibliothek.uni-wuerzburg.de/frontdoor/index/index/year/2015/docId/11914”

      5) Figure 3C shows that the amount of PER2::LUC mRNA changes ~2 fold between time = 0 hr and 24 hr in the CKO cell. This amplitude is similar to that observed in WT cell although the peak phase is different. Does the PER2::LUC mRNA level show the oscillation in CKO cells?

      No, we think we have shown convincingly this is not the case. We argue the data in figure 3C show that: (a) there is no circadian variation in mRNA PER2::LUC expression (mRNA levels increase but no trough is observed) and (b) that the temporal relationship between protein and mRNA as observed in WT is broken; i.e. the CRY-independent circadian variation in protein levels cannot be “driven by” changes in transcript levels. Similar results were obtained using transcriptional reporters Per2:LUC and Cry1:LUC (Figure S3E and F). Moreover, our findings are also in line with previous reports, such as Nangle et al. (2014, eLife) and Ode et al. (Mol Cell, 2017).

      6) Figure 3D: the authors discuss the amplitude and variation (whether the signal is noisier or not) of reporter luciferase expression between different cell lines. However, a huge difference in the luciferase signal can be observed even in the detrended bioluminescence plot. This reviewer is concerned that some of the phenotypes of CKO and CPKO MEFs reflect lower transfection efficiency of the reporter gene, not the nature of the circadian oscillators of these cell lines.

      As reported in the methods, these are stable cell lines rather than transiently transfected cells. The detrended luciferase data presented here do not actually reflect raw levels of luciferase protein expression, but rather reflect the amount of deviation from the 24 hour average. To make it easier to compare expression levels of Per2:LUC and Nr1d1:LUC between the different cell lines we have added figure S3H, presenting the average raw bioluminescence levels over 24 hours (after 24 hours of recovery from media change; i.e. from 24-48 hours). Using these data one can appreciate that expression levels of the Per2 reporter are never lower in CRY KO cells when compared to WT. We hope these data can take away the reviewer’s concerns about expression levels causing the differences observed.
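
      The point that detrended traces do not report absolute expression can be sketched as follows. This is a minimal illustration that assumes detrending by subtraction of a centred moving average; the exact detrending procedure used for the figures is not specified here, and the function name, window size and sample values are hypothetical.

```python
def detrend_moving_average(signal, half_window):
    """Subtract a centred moving average of 2 * half_window + 1 samples.

    Points closer than half_window to either end are dropped because a full
    window is not available there.
    """
    if half_window < 0 or len(signal) <= 2 * half_window:
        raise ValueError("signal is too short for the requested window")
    detrended = []
    for i in range(half_window, len(signal) - half_window):
        window = signal[i - half_window : i + half_window + 1]
        detrended.append(signal[i] - sum(window) / len(window))
    return detrended


# Hypothetical traces: identical oscillation on a bright and a dim baseline.
oscillation = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
bright_line = [1000.0 + v for v in oscillation]
dim_line = [10.0 + v for v in oscillation]
bright_detrended = detrend_moving_average(bright_line, 2)
dim_detrended = detrend_moving_average(dim_line, 2)
```

      Both detrended traces are the same even though the raw baselines differ a hundredfold, which is why raw averages are needed to compare expression levels between cell lines.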

      Reviewer #1 (Significance (Required)): Although Cryptochrome (Cry) has been considered a central component of the mammalian circadian clock, several studies have shown that circadian rhythms are maintained in the absence of Cry, including in the neonate SCN and red blood cells. Thus, although the need for Cry as a circadian oscillator has been debated, its essential role as a circadian oscillator remains established, at least in the cell-autonomous clock driven by the TTFL. This study provides additional evidence that circadian rhythmicity can persist in the absence of Cry. In a more general context, the presence of a non-TTFL circadian oscillator has been one of the major topics in the field of circadian clocks, except for the cyanobacteria. In mammals, the authors’ and other groups led the discovery of circadian oscillation in the absence of the canonical TTFL by showing the redox cycle in red blood cells (O’Neill, Nature 2011). The presence of circadian oscillation in the absence of Bmal1 was also reported recently (Ray, Science 2020). Bmal1(-CLOCK), CRY, and PER compose the core mechanism of the canonical circadian TTFL; thus, this manuscript adds another layer of evidence for non-TTFL circadian oscillation in mammals. Overall, the manuscript reports several surprising results that will receive considerable attention from the circadian community. This reviewer has expertise in the field of mammalian circadian clocks, including genomics, biochemistry, and mouse behavior analysis.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In the canonical model of the mammalian circadian system, transcription factors, BMAL1/CLOCK, drive transcription of Cry and Per genes and CRY and PER proteins repress the BMAL1/CLOCK activity to close the feedback loop in a circadian cycle. The dominant opinion was that CRY1 and CRY2 are essential repressors of the mammalian circadian system. However, this was challenged by persistent bioluminescence rhythms observed in SCN slices derived from Cry-null mice (Maywood et al., 2011 PNAS) and then by persistent behavior rhythms shown by the Cry1 and Cry2 double knockout mice if they are synchronized under constant light prior to free running in the dark (Ono et al., 2013 PLOS One). In the manuscript, the authors first confirmed behavioral and molecular rhythms in the Cry1/Cry2- deficient mice and then provided evidence to suggest the rhythms of Per2:LUC and Nr1d1:LUC in CKOs are generated from the cytoplasmic oscillator instead of the well-studied transcription and translation feedback loop: Constant Per2 transcription driven by BMAL1/CLOCK plus rhythmic degradation of the PER protein result in a rhythmic PER2 level in the absence of both Cry1 and Cry2, which suggests a connection between the classic transcription- and translation-based negative feedback loops and non-canonical oscillators. **Major points:** Line 38-39, "Challenging this interpretation, however, we find evidence for persistent circadian rhythms in mouse behavior and cellular PER2 levels when CRY is absent." The rhythmic behavioral phenotype of cry1 and cry2 double knockout mice was first documented by Ono et al., 2013 PLOS ONE, in which eight cry1 and cry2 double knockout mice after synchronization in the light displayed circadian periods with different lengths and qualities. 
      The paper reported two period lengths from the Cry mutant mice: "An eye-fitted regression line revealed that the mean shorter period was 22.86+/-0.4 h (n= 8) and the mean longer period was 24.66+/-0.2 h (n =9). The difference of two periods was statistically significant (p < 0.01).", either of which is quite different from the ~16.5 hr period in Figure 1B of the manuscript. A brief discussion on the period difference between studies will be helpful for readers to understand. Period information from the individual mouse should be calculated and shown since big period variations exist among CKO mice (Ono et al., 2013 PLOS One).

      Thanks for this suggestion. The mice used by Ono et al were raised from birth in constant light, whereas we used mice that were weaned and raised in normal LD cycles before being subject to constant light then constant dark as adults. Instead of the somewhat subjective fitting of regression lines by eye performed by Ono et al, our analysis was performed using the periodogram analysis routine of ClockLab 6.0 with a significance threshold for rhythmicity of p=0.0001. We have now repeated this experiment with 10 adult CKO mice (male and female), and found no evidence for two period lengths in that the second most significant period was consistently double that of the first. As the reviewer suggests, there is a much broader distribution of CKO mouse periods compared with WT, as we also found in cultured cells and SCN. These new data are now reported in revised Figure S1B & C. We have also included a statement about how our study differs from Ono et al in the supplementary discussion.
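
      For readers unfamiliar with periodogram analysis, the following is a minimal sketch of a chi-square (Sokolove-Bushell) periodogram, one common method for activity records. It is not ClockLab's implementation, and whether that routine uses this exact statistic is an assumption; the function name and the synthetic activity record are hypothetical.

```python
def chi_square_periodogram(samples, period):
    """Return the Q_P statistic for a trial period given in samples.

    The record is folded into rows of length `period`; Q_P compares the
    variance of the column means with the total variance of the data used.
    """
    if period < 2 or period > len(samples) // 2:
        raise ValueError("period must allow at least two full cycles")
    rows = len(samples) // period
    used = samples[: rows * period]  # drop the incomplete final cycle
    n = len(used)
    mean = sum(used) / n
    total_ss = sum((x - mean) ** 2 for x in used)
    if total_ss == 0:
        raise ValueError("a constant record has no periodicity to test")
    column_means = [sum(used[h::period]) / rows for h in range(period)]
    between_ss = sum((m - mean) ** 2 for m in column_means)
    return rows * n * between_ss / total_ss


# Synthetic record: 8 samples active, 8 samples inactive, repeated 10 times.
activity = [1.0 if (i % 16) < 8 else 0.0 for i in range(160)]
q_true = chi_square_periodogram(activity, 16)
q_double = chi_square_periodogram(activity, 32)
q_off = chi_square_periodogram(activity, 13)
```

      In this synthetic case the statistic at twice the true period is as large as at the true period itself, which illustrates why a second peak at double the first period is usually read as a harmonic rather than as evidence of a second rhythm.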

      The behavioral phenotype of Cry-null mice and luminescence from their SCNs are robustly rhythmic while fibroblasts derived from these mice only produce rhythms with very low amplitudes compared with those in WT, which may reflect the difference between the SCN’s rhythm and peripheral clocks. The behavioral phenotype is supposed to be controlled mainly by SCN. However, most molecular analyses in the work were done with MEF and lung fibroblasts. These tissues may not be the best representative of the behavioral phenotype of the CKO mice.

      Behavioural rhythms of CKO mice are significantly less robust than WT, with mean amplitude less than 50% of WT controls (Figures 1A & B, revised S1B). Furthermore, as reported, 40% of CKO SCN slices exhibited PER2::LUC rhythms, compared with 100% of WT SCN slices (as also observed by Maywood et al., PNAS, 2013), and therefore are also less robust by the definition used in this manuscript.

      As now discussed in the revised supplementary discussion:

      “Circadian timekeeping is a cellular phenomenon. Co-ordinated ~24h rhythms in behaviour and physiology are observed in multi-cellular mammals under non-stressed conditions when individual cellular rhythms are synchronised and amplified by appropriate extrinsic and intrinsic timing cues.”

      The objective of this study was to understand the fundamental determinants that allow mammalian cells to generate a circadian rhythm, which we find does not include an essential role for CRY genes/proteins. Thus the cell is the appropriate level of biological abstraction at which to investigate the phenomenon, whereas the SCN and behavioural recordings simply serve to illustrate the competence of CRY-independent timing mechanisms to co-ordinate biological rhythms at higher levels of biological scale which are manifest under some conditions. To reiterate, the behavioural data supports the cellular observations, not the converse.

      Stronger evidence is needed to fully exclude the possibility that, in CKO cells, the rhythm is generated by PERs' compensation for the loss of Crys to repress BMAL1 and CLOCK. Since the rhythms of Per:LUC or Nr1d1:LUC (Figures 3D and S3E) are much weaker than those in WT, molecular analyses might not be sensitive enough to reflect the changes across a circadian cycle in the CKOs if the TTFL still occurs. CLOCKΔ19 mutant mice have a ~4 hr longer period than WT (Antoch et al., 1997 Cell; King et al., 1997 Cell). CLOCKΔ19; CKO cells or mice should be very helpful to address the question. Periods of Per:LUC and Nr1d1:LUC from the CLOCKΔ19; CKO should be similar to those in the CKO alone if the transcription feedback does not contribute to their oscillations.

      We agree this would be an interesting experiment, however the data in this manuscript and Wong et al. (BioRxiv, 2020), whilst not disputing the existence of the TTFL, strongly suggest that it fulfils a different function to that which is currently accepted and is not the mechanism that ultimately confers circadian periodicity upon mammalian cells. CLOCKΔ19 is an antimorphic gain-of-function mutation with many pleiotropic effects. Therefore, if the TTFL is not the basis of circadian timekeeping in mammalian cells, it follows that the CLOCKΔ19 mutation may not elicit its effects on circadian rhythms through delaying the timing of transcriptional activation, as was proposed. As such, whether or not CLOCKΔ19 alters circadian period of CKO cells/mice would not allow the two models to be distinguished in the way that the reviewer envisions.

      Secondly, we cannot detect any interaction between PER2 and BMAL1 in the absence of CRY using an extremely sensitive assay.

      Thirdly, very strong biochemical evidence suggests that PER has no repressive function in the absence of CRY (Chiou et al., 2016; Kume et al., 1999; Ode et al., 2017; Sato et al., 2006).

      Finally, in several figures, particularly 3C and 4A, we show that PER2 peaks at the same time in CKO and WT cells, but in CKO cells this is not accompanied by a coincident peak in the mRNA. Thus, even if PER were able to repress BMAL1/CLOCK without CRY, rhythms in PER2 protein level could not be explained by some residual PER/BMAL1-dependent TTFL mechanism.

      To address the reviewer’s concern however, we have employed mouse red blood cells which offer unambiguous insight into the causal determinants of circadian timing, as we can be absolutely confident that there is no transcriptional contribution to cellular timekeeping. Briefly, we took fibroblasts and RBCs from WT, short period Tau/Tau and long period Afh/Afh mutant mice. The basis of the circadian phenotype of these mutations is quite well established as occurring through the post-translational regulation of PER and CRY proteins respectively, and the mutations result in short and long period PER2::LUC rhythms compared with WT fibroblasts. RBCs do not express PER or CRY proteins, and commensurately no genotype-dependent differences of RBC circadian period were observed (Beale et al, 2020, in submission). In contrast, RBC circadian rhythms are sensitive to pharmacological inhibition of casein kinase 1 (Beale et al., JBR, 2019).

      Lines 51-52, "PER/CRY-mediated negative feedback is dispensable for mammalian circadian timekeeping" and lines 310-311, "We found that transcriptional feedback in the canonical TTFL clock model is dispensable for cell-autonomous circadian timekeeping in animal and cellular models." The authors have not excluded the possibility that the rhythmic behaviors of the CKO mice are derived from the PERs' compensation for the role of Crys in the feedback loop of the circadian clock in the SCN. In the fibroblasts, only two genes, Per2 and Nr1d1, have been studied in the work, which cannot be simply expanded to the thousands of circadian controlled genes. Also amplitudes of PER2:LUC and NR1D1:LUC in the CKOs are much lower than those in WT and no evidence has been provided to show that their weak rhythms are biologically relevant.

      The definition of a circadian rhythm (Pittendrigh, 1960) does not mention biological relevance or stipulate any lower threshold for amplitude. As now stated in the revised text (page 6):

      PER2::LUC rhythms in CKO cells were temperature compensated (Figure 2A, B) and entrained to 12h:12h 32°C:37°C temperature cycles in the same phase as WT controls (Figures 2C), and thus conform to the classic definition of a circadian rhythm (Pittendrigh, 1960) – which does not stipulate any lower threshold for amplitude or robustness.
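
      Temperature compensation is conventionally summarised by the temperature coefficient Q10 of the oscillation rate, where a value close to 1 indicates that the period barely changes with temperature. A minimal sketch follows, assuming the rate is taken as the reciprocal of the period; the function name and the period values are hypothetical and are not the measurements in Figure 2.

```python
def q10_from_periods(period_low, temp_low, period_high, temp_high):
    """Q10 of the oscillation rate (1 / period) between two temperatures.

    Q10 = (rate_high / rate_low) ** (10 / (temp_high - temp_low)), which in
    terms of periods is (period_low / period_high) ** (10 / delta_T).
    """
    if temp_high <= temp_low:
        raise ValueError("temp_high must be greater than temp_low")
    if period_low <= 0 or period_high <= 0:
        raise ValueError("periods must be positive")
    return (period_low / period_high) ** (10.0 / (temp_high - temp_low))


# Hypothetical periods (hours) at 32 and 37 degrees Celsius.
compensated = q10_from_periods(24.5, 32.0, 24.0, 37.0)
# An uncompensated process whose rate doubles per 10 degrees, for contrast.
uncompensated = q10_from_periods(24.0, 27.0, 12.0, 37.0)
```

      Note that this criterion depends only on how the period changes with temperature, so it can be applied to low-amplitude rhythms as readily as to robust ones.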

      We make no claims about biological relevance or amplitude in this manuscript, which are addressed in our related manuscript (Wong et al., BioRxiv, 2020). In this related manuscript, we explicitly address whether CRY is necessary for mammalian cells to maintain a circadian rhythm in the abundance of clock-controlled proteins and find that it is not. Indeed, twice as many rhythmically abundant proteins are observed in CKO cells than WT controls, which suggests that, if anything, CRY functions to suppress rhythms in protein abundance rather than to generate them.

      We observe circadian rhythms in the activity of two different bioluminescent reporters, which have already been extensively characterised. The mouse and SCN data in figure 1 are correlative, and simply show that previous published observations are reproducible. PER2::LUC oscillations are not accompanied by Per2 mRNA oscillations. This, together with the absence of a BMAL1-PER2::LUC complex strongly argues against a model where PER2 oscillations are driven by residual (PER2-driven) transcriptional oscillations.

      We therefore concede the reviewer’s point that we “cannot exclude rhythmic behaviors of the CKO mice are derived from the PERs' compensation for the role of Crys in the feedback loop of the circadian clock in the SCN”. The reviewer will agree, however, that there exists very strong biochemical evidence suggesting that PER has no repressive function in the absence of CRY (Chiou et al., 2016; Kume et al., 1999; Ode et al., 2017; Sato et al., 2006); that there exists no experimental evidence to suggest that PERs can fulfil this function in the absence of CRY in any mammalian cellular context; and finally that our observations are not consistent with the canonical model for the generation of circadian rhythms in mammals.

      We have therefore amended the text to focus on CRY specifically, as follows:

      PER/CRY-mediated negative feedback is dispensable for mammalian circadian timekeeping

      Page 12. “We found that CRY-mediated transcriptional feedback in the canonical TTFL clock model is dispensable for cell-autonomous circadian timekeeping in cellular models. Whilst we cannot exclude the possibility that in the SCN, but not fibroblasts, PER alone may be competent to effect transcriptional feedback repression in the absence of CRY, we are not aware of any evidence that would render this possibility biochemically feasible.”

      **Minor points:** Lines 66-67, "...(Dunlap, 1999; Reppert and Weaver, 2002; Takahashi, 2016)." to "... (reviewed in Dunlap, 1999; Reppert and Weaver, 2002; Takahashi, 2016)."

      Thanks, changed as requested.

      Line 70, "...((Liu et al., 2008..." to "...(Liu et al., 2008..."

      Thanks, changed as requested.

      Lines 174-175, "Considering recent reports that transcriptional feedback repression is not absolutely required for circadian rhythms in the activity of FRQ...". Larrondo et al., 2015 paper says "however, in such ∆fwd-1 cells, the amount of FRQ still oscillated, the result of cyclic transcription of frq and reinitiation of FRQ synthesis." The point of the paper is "we unveiled an unexpected uncoupling between negative element half-life and circadian period determination." instead of "...transcriptional feedback repression is not absolutely required for circadian rhythms in the activity of FRQ,"

      This is a good point which, following discussion with Profs Dunlap and Larrondo, we have revised into “no obligate relationship between clock protein turnover and circadian regulation of its activity” – a more accurate summary of their findings.

      Lines 249-252, "CKO cells exhibit no rhythm in Per2 mRNA (Figure 3C, D), nor do they show a rhythm in global translational rate (Figure S4A, B), nor did we observe any interaction between BMAL1 and S6K/eIF4 as occurs in WT cells (Lipton et al, 2015) (Figure S4C)." In figures 3D and S3E, in CKO and CPKO cells the Per2:LUC data without fitting look better than that of Nr1d1:LUC. But the Nr1d1:LUC rhythm became clear after fitting the raw data. So to better visualize the low amplitude rhythm, if any, of Per2:LUC and compare with Nr1d1:LUC, fitted the Per2:LUC data in CKOs and CPKOs in Figure 3D and S3E should be shown as what has been done to Nr1d1:LUC.

      Thanks, these data can be found in Figure S3F. The detrended Per2:Luc CKO and CPKO bioluminescence traces were better fit by the null hypothesis (straight line) than a damped sine wave (p>0.05) and so were not significantly rhythmic by the criteria used in this manuscript.

      Lines 258-259, "much less than the half-life of luciferase expressed in fibroblasts under a constitutive promoter" In figure S4D, the y-axis of the PER2::LUC is ~800 while the y-axis of the SV40::LUC is ~600000. The over-expressed LUC by the SV40 promoter might saturate the degradation system in the cell so the comparison is not fair. A weaker promoter with the level similar to Per2 should be used to make the comparison.

      Thank you for this suggestion. In our experience, the SV40 promoter is actually a rather weak promoter compared with CMV, and faithfully facilitates the constitutive (non-rhythmic) expression of heterologous proteins such as luciferase (Feeney et al., JBR, 2016). It has been shown previously that constitutive over-expression of heterologous proteins such as GFP or even CRY1 does not affect circadian rhythms in fibroblast cells (e.g. Chen et al., Mol Cell, 2009). To address the reviewer’s reasonable concern, however, multiple stable SV40:Luc fibroblast lines were generated by puromycin selection, grown to confluence in 96-well plates, then treated with 25 μg/mL CHX at the beginning of the recording. Random genomic integration of SV40:Luc leads to a broad range of different levels of luciferase expression, evident from the broad range of initial luciferase activities. For each line the decline in luciferase activity was fit with a simple one-phase exponential decay curve (R² ≥ 0.98) to derive the half-life of luciferase in each cell line. There was no significant relationship between the level of luciferase expression and luciferase stability (straight line vs. horizontal line fit p-value = 0.82). Therefore constitutive expression of SV40:Luc in fibroblasts does not affect the cellular protein degradation machinery within the range of expression used for our half-life measurements. These new data are reported in Revised Figure S3H.
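      As a minimal sketch of how a half-life can be derived from a one-phase exponential decay, the snippet below fits y = y0·exp(−kt) and reports t½ = ln 2 / k. It is an illustration only and not the analysis used for Figure S3H: the log-linear least-squares fit, the function name `fit_half_life`, the sampling times, the initial activities and the 1.5 h half-life are all assumptions made for this example, and the R² it reports is computed on log-transformed values.

```python
import math


def fit_half_life(times_h, activities):
    """Fit y = y0 * exp(-k * t) by least squares on log(y).

    Returns (half_life_h, r_squared), with r_squared computed in log space.
    """
    logs = [math.log(a) for a in activities]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_log) for t, y in zip(times_h, logs))
    slope = sxy / sxx                      # slope equals -k
    intercept = mean_log - slope * mean_t  # intercept equals log(y0)
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times_h, logs))
    ss_tot = sum((y - mean_log) ** 2 for y in logs)
    r_squared = 1.0 - ss_res / ss_tot
    return math.log(2) / -slope, r_squared


# Hypothetical, noise-free traces: three lines with very different initial
# luciferase activity but the same (assumed) 1.5 h half-life.
times_h = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
true_half_life_h = 1.5
initial_levels = [800.0, 60000.0, 600000.0]

fitted = []
for level in initial_levels:
    trace = [level * 0.5 ** (t / true_half_life_h) for t in times_h]
    fitted.append(fit_half_life(times_h, trace))

for level, (half_life_h, r_squared) in zip(initial_levels, fitted):
    print(f"initial activity {level:>9.0f}: half-life {half_life_h:.2f} h, R^2 {r_squared:.3f}")
```

      Because the decay constant is estimated from the slope of log(activity) against time, the fitted half-life does not depend on the initial activity. This is why half-life can be compared across lines with very different expression levels.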

      Line 430, "sigma" to "Sigma".

      Changed

      In figure S2, the classification of rhythms in Drosophila is not clear since even the "Robustly rhythmic" ones have high background noise. Detrending or fitting the data might be able to improve the quality of the rhythms prior to classification.

      These are noisy data as they come from freely behaving flies. The mean data was shown in Figure S3A and individual examples in S3B, and look very similar to previous bioluminescence fly recordings of XLG-LUC flies in papers from the Stanewsky lab who have published extensively using this model. The classifications arose from double-blinded analysis of the bioluminescence traces by several individuals, but we agree that this was not clearly communicated in our original submission. In Revised figure S2 we now present the mean bioluminescence traces, with and without damped sine wave vs. straight line fitting, as suggested, which is more consistent with the mammalian cellular data presented elsewhere.

      In figure S3B, the original blots for Per2 including Input and IP should be shown.

      The original blots for BMAL1 are shown in figure S3I. PER2::LUC levels were assessed by measuring bioluminescence levels present on the anti-bmal1-beads, as described in the figure 3B legend.

      Supplemental information Line 44, "...(reviewed in (Lakin-Thomas,..." to "...(reviewed in Lakin-Thomas,..."

      Changed

      Line 188, "Period CDS", the full name of CDS should be provided the first time it appears.

      Changed to “coding sequence”.

      Reviewer #2 (Significance (Required)): The work suggests a link between the TTFL and non-canonical oscillators, which should be interesting to the circadian field.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:** The paper "CRYPTOCHROMES confer robustness, not rhythmicity, to circadian timekeeping" by Putker et al. answers the question of whether or not the rhythmic abundance of clock proteins is a prerequisite for circadian timekeeping. They addressed this by monitoring PER2::LUC rhythms in WT and CRY KO (CKO) cells. CRY forms a complex with PER, which in turn represses the ability of CLOCK/BMAL1 to drive the expression of clock-controlled genes, including PER and CRY. Consistent with previous observations, the authors found residual PER2::LUC rhythms in CKO SCN slices, fibroblasts and in a functional analogue KO of CRY in Drosophila, even in the absence of rhythmic Per2 transcription due to the loss of CRY as a negative regulator of the oscillation. They have shown that these rhythms, in the absence of CRY, follow the formal definition of circadian rhythms. They attributed these residual PER2::LUC rhythms to the maintenance of oscillation in PER2::LUC stability independent of CRY, by testing the decay kinetics of luciferase activity when translation is inhibited. Moreover, they implicated the kinases CK1δ/ε and GSK3 to be involved in regulating PER2::LUC post-translational rhythms through kinase inhibitor studies. They concluded that CRY is not necessary for maintaining PER2::LUC rhythms, but plays an important role in reinforcing high-amplitude rhythms when coupled to a proposed "cytoscillator" likely composed of CK1δ/ε and GSK3.

      **Major comments:** The authors have shown sufficient data that under different testing conditions (mice locomotor activity, SCN preps or fibroblasts), behavioral rhythms and PER2::LUC rhythms are still observed in the CRY KO (CKO) cells, contrary to a previous study (Liu et al., 2007). They also indicated limitations to some of the experimental work. However, there are some parts of the paper that need clarification to support their conclusions.

      1. In Fig. 1A, the x-axes of the actograms for WT and CKO are different. While they mentioned this in the figure legend, and described the axis transformation in Fig. S1A, they need a justification statement about why they did this in the results.

      Thanks, we have included the following sentence in the results section as requested:

      Figure 1 representative actograms are plotted as a function of endogenous tau (τ) to allow the periodic organisation of rest-activity cycles to be readily discerned; 24h-plotted actograms are shown in Figure S1A and S2A.

      2.In an attempt to show conservation of their proposed role for CRY, they tested the model system Drosophila melanogaster where TIMELESS serves as the functional analogue of CRY. While they showed in the figures and described in the text that rhythms still persisted with lower relative amplitude in the TIMELESS-deficient flies, they did not describe any period differences between WT and mutant. Showing the period quantification in Supp. Fig. S2 using the robustly rhythmic datasets, and describing this data in the text, will strengthen their claim.

      These analyses are now reported in revised Figure S2 as requested. As described in our response to reviewer 2, the “robustly rhythmic” flies were scored as such through double-blinded analysis by several individuals. We hope the reviewer will appreciate our concern that exclusion of the majority of TIMELESS-deficient flies that were not robustly rhythmic might skew their apparent period by unconscious bias towards favouring traces that most clearly resemble robustly rhythmic WT controls. To avoid any potential bias we therefore included all flies of both genotypes in the analysis of circadian period for the revised figure, as suggested by our other reviewers.

      In Fig. S2B, there is no clear distinction between the representative datasets shown for poorly rhythmic and arrhythmic, i.e. they all appear arrhythmic, without an indicated statistical test. The authors could present better representative data to better reflect the categories.

      As described above, we now show the grouped mean with and without fitting for all flies of both genotypes. The statistical test for rhythmicity and analysis of circadian period is now the same as was performed for the cellular data presented elsewhere.

      3. In Fig. 2A, the authors note the lack of rhythmicity in the CKO fibroblasts in the 1st three days at 37°C. How are the conditions here different from fibroblasts in Fig. 1E, where rhythms are seen during the 1st three days in CKO fibroblasts?

      As discussed in the manuscript, PER2::LUC rhythms in CKO cells and SCN are observed stochastically between recordings i.e. if one dish in a recording showed rhythms, all dishes showed rhythms and vice versa. The media change that occurred after 3 days in Fig 2A, in this case, was sufficient to initiate clear rhythms of PER2::LUC in all experimental replicates. In other experiments, media change did not have this effect. Herculean efforts by multiple lab members over many years, including the PI, have been unable to delineate the basis of this variability – which is discussed at length in the thesis of Dr. David Wong https://www.repository.cam.ac.uk/handle/1810/300610. As such, we clearly state in the discussion:

      “We were unable to identify all of the variables that contribute to the apparent stochasticity of CKO PER2::LUC oscillations, and so cannot distinguish whether this variability arises from reduced fidelity of PER2::LUC as a circadian reporter or impaired timing function in CKO cells. In consequence, we restricted our study to those recordings in which clear bioluminescence rhythms were observed, enabling the interrogation of TTFL-independent cellular timekeeping.”

      4. The authors claimed in the results section: "in contrast and as expected, Per2 mRNA in WT cells varied in phase with co-recorded PER2::LUC oscillations." but Fig. 3C does not show this expected lag between mRNA and protein levels. This needs to be explained.

      No lag is expected in vitro. A lag between PER protein levels and Per mRNA does occur in vivo and is very likely attributable to daily rhythms in feeding (Crosby et al, Cell, 2019), where increased insulin signalling elicits an increase in PER protein production 4-6h after E-box and GRE-stimulated increase in Per transcription.

      When luciferin is saturating intracellularly, PER2::LUC activity correlates most closely with the amount of PER2::LUC protein that was translated during the preceding 1-2h, rather than the total amount of PER2, due to the enzymatic inactivation of the luciferase protein (Feeney et al, JBR, 2016). Consistent with many previous observations, under constant conditions, the rate of nascent PER protein synthesis is largely determined by the level of Per2 mRNA, and thus more similar phases are observed between protein and mRNA in vitro than in vivo.

      We have inserted an additional citation of Feeney et al at this point in the text to make this clear.

      5. In Figs. 5A-B, the PER2::LUC periods in the CKO untreated cells seem to vary significantly between A, B, and C. While this could be due to the high variability in the rhythms that were previously described by the authors, the average periods here seem to be longer than the one reported in Fig. 1F. Are there specific condition differences?

      There are no specific condition differences. As reported in Figure S1B, D & E, the range of CKO cellular periods is simply much broader than for WT cells. Over several dozen experiments the average period was significantly shorter, but the period variance is an equally striking feature of rhythms in these cells which we take as evidence for their lack of robustness.

      *Would additional experiments be essential to support the claims of the paper?*

      1. There is sufficient experimental data to support the major claims; however some suggested experiments are listed below.

        a. If CKO exhibits residual rhythms in PER::LUC, it would be interesting to know how CRY overexpression influences PER2::LUC rhythms, or point to previous reference papers which may have already shown such effects. The prediction would be PER2::LUC levels will still be rhythmic when CRY is overexpressed. What would be the extent of "robustness" conferred by CRY on PER2::LUC rhythms based on CRY KO and overexpression studies?

      These experiments have largely already been performed (see Chen et al., Mol Cell; Nangle et al., eLife, 2014; Fan et al., Curr Biol, 2007; Edwards et al., PNAS, 2016) and are cited in this manuscript. As suggested, PER2 rhythms remain intact under CRY1 over-expression, though they are clearly perturbed, but their robustness was not investigated in any detail. We hope to be able to address this important question in our subsequent work.

        b. The authors found that CK1δ/ε and GSK3 contribute to CRY-independent PER2 oscillations by showing that addition of kinase inhibitors affect the PER2::LUC period lengths in WT and CKO in the same manner. It would be interesting to know if a) PER2::LUC stability and b) PER2 phosphorylation status, is affected in WT and CKO in the presence of the inhibitors, or point to previous reference papers which may already have shown such effects.

      As the reviewer points out, PER2 stability is already reported to be regulated via phosphorylation by GSK3 and CK1. We have made explicit reference to this in the revised manuscript as follows:

      “In contemporary models of the mammalian cellular clockwork CRY proteins are essential for rhythmic PER protein production, however, the stability and activity of PER proteins are also regulated post-translationally (Lee et al., 2009; Philpott et al., 2020; Iitaka et al, 2005).”

      *Are the data and the methods presented in such a way that they can be reproduced?*

      1. The protocol for the inhibitor treatments is not in the main or supplemental methods.

      In the main text methods, section luciferase recordings we state: “For pharmacological perturbation experiments (unless stated otherwise in the text) cells were changed into drug-containing air medium from the start of the recording. Mock-treatments were carried out with DMSO or ethanol as appropriate.”

      *Are the experiments adequately replicated and statistical analysis adequate?*

      1. All experiments had the sufficient number of technical and biological replicates to make valid statistical analyses. For Fig. S2, the authors used RAIN to assess rhythmicity in WT and mutant flies, but it is not clear whether the different categories (rhythmic, poorly rhythmic, and arrhythmic) were based on amplitude differences alone, or a combination of amplitude and p-values as determined by RAIN.

      As reported above, we have revised the analysis of the fly data to be consistent with the cellular data reported elsewhere in the manuscript.

      **Minor comments:** *1. Are prior studies referenced appropriately?* Authors may wish to include Fan et al., 2007, Current Biology which demonstrated that cycling of CRY1, CRY2, and BMAL1 is not necessary for circadian-clock function in fibroblasts.

      Apologies for the omission of citation to this excellent paper. Now referenced in the introduction.

      *2. Are the text and figures clear and accurate?* Figures were clear and illustrated well. See minor comments on text below:

      1. Other minor comments

      Main Text: p3, line 62; p12, line l32: It doesn't seem necessary or appropriate to cite the dictionary for the definition of robust.

      Thanks for this suggestion. During preparation of the manuscript we found that there was some disagreement between authors as to the meaning of robustness in a circadian context. We therefore feel it most necessary to define clearly what we mean by the use of this word to avoid any potential ambiguity.

      p4, line l87: "~20 h" rhythms instead of "~20h-hour"

      p3, line 70; p5, line 121; p14, line 380; p16, line 416 and p18, line 458: Close parentheses have been doubled in parenthetical references.

      p14, line 363: "crassa" instead of "Crassa"

      p17, line 430: "Sigma" instead of "sigma"

      p18, lines 464 and 483; p20, line 521: put a space between numerical values and units, to be consistent with other entries

      p19, line 488: "luciferase" instead of Luciferase

      p20, line 512: "Cell Signaling" instead of "cell signalling"

      p20, line 526: "single" instead of "Single"

      We thank the reviewer for his/her thoroughness, all of the above have been changed.

      Main figures: Fig. 2 p37, line 921: close parenthesis was doubled on "red"

      This was actually correct.

      Fig. 4 p41, line 989: "0.1 mM" instead of "0.1 mM" for consistency throughout text

      Supplementary text:

      line 171: "30 mM HEPES" instead of "30mM HEPES"

      line 184: "Cell Signaling" instead of "cell signalling"

      Supplementary figures:

      Fig. S2A "Drosophila melanogaster" instead of "Drosophila Melanogaster"

      All of the above have been changed.

      Reviewer #3 (Significance (Required)): This paper revisits the previously proposed idea that rhythmic expression of central TTFL components is not essential for circadian timekeeping to persist. However, this paper does not add a significant advance in the understanding of the underlying reasons behind sustained clock protein rhythmicity like PER in the absence of CRY, since such mechanisms in functional analogs have been shown in other systems, like Neurospora (Larrondo et al., 2015). However, this paper does clarify some issues in the field, such as discrepancies between behavioral and cellular rhythms observed in CKO mice, leading future researchers to examine closely the conditions of their CKO rhythmic assays before making conclusions pertaining to rhythmicity. The identification of the kinases as components of the proposed cytosolic oscillator (cytoscillator) needs further validation, but this is perhaps beyond the scope of the paper. The data provides incremental evidence for the existence of a cytoscillator, but opens up opportunities to identify other players, like phosphatases, to establish the connection between the central TTFL and the proposed cytoscillator.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The paper "CRYPTOCHROMES confer robustness, not rhythmicity, to circadian timekeeping" by Putker et al. answers the question of whether or not the rhythmic abundance of clock proteins is a prerequisite for circadian timekeeping. They addressed this by monitoring PER2::LUC rhythms in WT and CRY KO (CKO) cells. CRY forms a complex with PER, which in turn represses the ability of CLOCK/BMAL1 to drive the expression of clock-controlled genes, including PER and CRY. Consistent with previous observations, the authors found residual PER2::LUC rhythms in CKO SCN slices, fibroblasts and in a functional analogue KO of CRY in Drosophila, even in the absence of rhythmic Per2 transcription due to the loss of CRY as a negative regulator of the oscillation. They have shown that these rhythms, in the absence of CRY, follow the formal definition of circadian rhythms. They attributed these residual PER2::LUC rhythms to the maintenance of oscillation in PER2::LUC stability independent of CRY, by testing the decay kinetics of luciferase activity when translation is inhibited. Moreover, they implicated the kinases CK1δ/ε and GSK3 to be involved in regulating PER2::LUC post-translational rhythms through kinase inhibitor studies. They concluded that CRY is not necessary for maintaining PER2::LUC rhythms, but plays an important role in reinforcing high-amplitude rhythms when coupled to a proposed "cytoscillator" likely composed of CK1δ/ε and GSK3.

      Major comments:

      The authors have shown sufficient data that under different testing conditions (mice locomotor activity, SCN preps or fibroblasts), behavioral rhythms and PER2::LUC rhythms are still observed in the CRY KO (CKO) cells, contrary to a previous study (Liu et al., 2007). They also indicated limitations to some of the experimental work. However, there are some parts of the paper that need clarification to support their conclusions.

      1. In Fig. 1A, the x-axes of the actograms for WT and CKO are different. While they mentioned this in the figure legend, and described the axis transformation in Fig. S1A, they need a justification statement about why they did this in the results.

      2. In an attempt to show conservation of their proposed role for CRY, they tested the model system Drosophila melanogaster where TIMELESS serves as the functional analogue of CRY. While they showed in the figures and described in the text that rhythms still persisted with lower relative amplitude in the TIMELESS-deficient flies, they did not describe any period differences between WT and mutant. Showing the period quantification in Supp. Fig. S2 using the robustly rhythmic datasets, and describing this data in the text, will strengthen their claim.

      In Fig. S2B, there is no clear distinction between the representative datasets shown for poorly rhythmic and arrhythmic, i.e. they all appear arrhythmic, without an indicated statistical test. The authors could present better representative data to better reflect the categories.

      3. In Fig. 2A, the authors note the lack of rhythmicity in the CKO fibroblasts in the 1st three days at 37°C. How are the conditions here different from fibroblasts in Fig. 1E, where rhythms are seen during the 1st three days in CKO fibroblasts?

      4. The authors claimed in the results section: "in contrast and as expected, Per2 mRNA in WT cells varied in phase with co-recorded PER2::LUC oscillations." but Fig. 3C does not show this expected lag between mRNA and protein levels. This needs to be explained.

      5. In Figs. 5A-B, the PER2::LUC periods in the CKO untreated cells seem to vary significantly between A, B, and C. While this could be due to the high variability in the rhythms that were previously described by the authors, the average periods here seem to be longer than the one reported in Fig. 1F. Are there specific condition differences?

      Would additional experiments be essential to support the claims of the paper?

      1. There is sufficient experimental data to support the major claims; however some suggested experiments are listed below.

      a. If CKO exhibits residual rhythms in PER::LUC, it would be interesting to know how CRY overexpression influences PER2::LUC rhythms, or point to previous reference papers which may have already shown such effects. The prediction would be PER2::LUC levels will still be rhythmic when CRY is overexpressed. What would be the extent of "robustness" conferred by CRY on PER2::LUC rhythms based on CRY KO and overexpression studies?

      b. The authors found that CK1and GSK3 contribute to CRY-independent PER2 oscillations by showing that addition of kinase inhibitors affect the PER2::LUC period lengths in WT and CKO in the same manner. It would be interesting to know if a) PER2::LUC stability and b) PER2 phosphorylation status, is affected in WT and CKO in the presence of the inhibitors, or point to previous reference papers which may already have shown such effects.

      Are the data and the methods presented in such a way that they can be reproduced?

      1. The protocol for the inhibitor treatments is not in the main or supplemental methods.

      Are the experiments adequately replicated and statistical analysis adequate?

      1. All experiments had the sufficient number of technical and biological replicates to make valid statistical analyses. For Fig. S2, the authors used RAIN to assess rhythmicity in WT and mutant flies, but it is not clear whether the different categories (rhythmic, poorly rhythmic, and arrhythmic) were based on amplitude differences alone, or a combination of amplitude and p-values as determined by RAIN.

      Minor comments:

      1. Other minor comments

      Main Text:

      p3, line 62; p12, line l32: It doesn't seem necessary or appropriate to cite the dictionary for the definition of robust.

      p4, line l87: "~20 h" rhythms instead of "~20h-hour"

      p3, line 70; p5, line 121; p14, line 380; p16, line 416 and p18, line 458: Close parentheses have been doubled in parenthetical references.

      p14, line 363: "crassa" instead of "Crassa"

      p17, line 430: "Sigma" instead of "sigma"

      p18, lines 464 and 483; p20, line 521: put a space between numerical values and units, to be consistent with other entries

      p19, line 488: "luciferase" instead of Luciferase

      p20, line 512: "Cell Signaling" instead of "cell signalling"

      p20, line 526: "single" instead of "Single"

      Main figures:

      Fig. 2 p37, line 921: close parenthesis was doubled on "red"

      Fig. 4 p41, line 989: "0.1 mM" instead of "0.1 mM" for consistency throughout text

      Supplementary text:

      line 171: "30 mM HEPES" instead of "30mM HEPES"

      line 184: "Cell Signaling" instead of "cell signalling"

      Supplementary figures:

      Fig. S2A "Drosophila melanogaster" instead of "Drosophila Melanogaster"

      Significance

      This paper revisits the previously proposed idea that rhythmic expression of central TTFL components is not essential for circadian timekeeping to persist. However, this paper does not add a significant advance in the understanding of the underlying reasons behind sustained clock protein rhythmicity like PER in the absence of CRY, since such mechanisms in functional analogs have been shown in other systems, like Neurospora (Larrondo et al., 2015). However, this paper does clarify some issues in the field, such as discrepancies between behavioral and cellular rhythms observed in CKO mice, leading future researchers to examine closely the conditions of their CKO rhythmic assays before making conclusions pertaining to rhythmicity. The identification of the kinases as components of the proposed cytosolic oscillator (cytoscillator) needs further validation, but this is perhaps beyond the scope of the paper. The data provides incremental evidence for the existence of a cytoscillator, but opens up opportunities to identify other players, like phosphatases, to establish the connection between the central TTFL and the proposed cytoscillator.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In the canonical model of the mammalian circadian system, transcription factors, BMAL1/CLOCK, drive transcription of Cry and Per genes and CRY and PER proteins repress the BMAL1/CLOCK activity to close the feedback loop in a circadian cycle. The dominant opinion was that CRY1 and CRY2 are essential repressors of the mammalian circadian system. However, this was challenged by persistent bioluminescence rhythms observed in SCN slices derived from Cry-null mice (Maywood et al., 2011 PNAS) and then by persistent behavior rhythms shown by the Cry1 and Cry2 double knockout mice if they are synchronized under constant light prior to free running in the dark (Ono et al., 2013 PLOS One). In the manuscript, the authors first confirmed behavioral and molecular rhythms in the Cry1/Cry2- deficient mice and then provided evidence to suggest the rhythms of Per2:LUC and Nr1d1:LUC in CKOs are generated from the cytoplasmic oscillator instead of the well-studied transcription and translation feedback loop: Constant Per2 transcription driven by BMAL1/CLOCK plus rhythmic degradation of the PER protein result in a rhythmic PER2 level in the absence of both Cry1 and Cry2, which suggests a connection between the classic transcription- and translation-based negative feedback loops and non-canonical oscillators.

      Major points:

      Line 38-39, "Challenging this interpretation, however, we find evidence for persistent circadian rhythms in mouse behavior and cellular PER2 levels when CRY is absent." The rhythmic behavioral phenotype of cry1 and cry2 double knockout mice was first documented by Ono et al., 2013 PLOS ONE, in which eight cry1 and cry2 double knockout mice after synchronization in the light displayed circadian periods with different lengths and qualities. The paper reported two period lengths from the Cry mutant mice: "An eye-fitted regression line revealed that the mean shorter period was 22.86 ± 0.4 h (n = 8) and the mean longer period was 24.66 ± 0.2 h (n = 9). The difference of two periods was statistically significant (p < 0.01).", either of which is quite different from the ~16.5 hr period in Figure 1B of the manuscript. A brief discussion on the period difference between studies will be helpful for readers to understand. Period information from the individual mouse should be calculated and shown since big period variations exist among CKO mice (Ono et al., 2013 PLOS One).

      The behavioral phenotype of Cry-null mice and luminescence from their SCNs are robustly rhythmic while fibroblasts derived from these mice only produce rhythms with very low amplitudes compared with those in WT, which may reflect the difference between the SCN's rhythm and peripheral clocks. The behavioral phenotype is supposed to be controlled mainly by SCN. However, most molecular analyses in the work were done with MEF and lung fibroblasts. These tissues may not be the best representative of the behavioral phenotype of the CKO mice.

      Stronger evidence is needed to fully exclude the possibility that, in CKO cells, the rhythm is generated by PERs compensating for the loss of Crys in repressing BMAL1 and CLOCK. Since the rhythms of Per:LUC or Nr1d1:LUC (Figures 3D and S3E) are much weaker than those in WT, molecular analyses might not be sensitive enough to reflect the changes across a circadian cycle in the CKOs if the TTFL still occurs. CLOCKΔ19 mutant mice have a ~4 hr longer period than WT (Antoch et al., 1997 Cell; King et al., 1997 Cell). CLOCKΔ19; CKO cells or mice should be very helpful to address the question. Periods of Per:LUC and Nr1d1:LUC from the CLOCKΔ19; CKO should be similar to those in the CKO alone if the transcription feedback does not contribute to their oscillations.

      Lines 51-52, "PER/CRY-mediated negative feedback is dispensable for mammalian circadian timekeeping" and lines 310-311, "We found that transcriptional feedback in the canonical TTFL clock model is dispensable for cell-autonomous circadian timekeeping in animal and cellular models." The authors have not excluded the possibility that the rhythmic behaviors of the CKO mice are derived from the PERs' compensation for the role of Crys in the feedback loop of the circadian clock in the SCN. In the fibroblasts, only two genes, Per2 and Nr1d1, have been studied in the work, which cannot be simply expanded to the thousands of circadian controlled genes. Also amplitudes of PER2:LUC and NR1D1:LUC in the CKOs are much lower than those in WT and no evidence has been provided to show that their weak rhythms are biologically relevant.

      Minor points:

      Lines 66-67, "...(Dunlap, 1999; Reppert and Weaver, 2002; Takahashi, 2016)." to "... (reviewed in Dunlap, 1999; Reppert and Weaver, 2002; Takahashi, 2016)."

      Line 70, "...((Liu et al., 2008..." to "...(Liu et al., 2008..."

      Lines 174-175, "Considering recent reports that transcriptional feedback repression is not absolutely required for circadian rhythms in the activity of FRQ...". Larrondo et al., 2015 paper says "however, in such ∆fwd-1 cells, the amount of FRQ still oscillated, the result of cyclic transcription of frq and reinitiation of FRQ synthesis." The point of the paper is "we unveiled an unexpected uncoupling between negative element half-life and circadian period determination." instead of "...transcriptional feedback repression is not absolutely required for circadian rhythms in the activity of FRQ,"

      Lines 249-252, "CKO cells exhibit no rhythm in Per2 mRNA (Figure 3C, D), nor do they show a rhythm in global translational rate (Figure S4A, B), nor did we observe any interaction between BMAL1 and S6K/eIF4 as occurs in WT cells (Lipton et al, 2015) (Figure S4C)." In figures 3D and S3E, in CKO and CPKO cells the Per2:LUC data without fitting look better than that of Nr1d1:LUC. But the Nr1d1:LUC rhythm became clear after fitting the raw data. So to better visualize the low amplitude rhythm, if any, of Per2:LUC and compare it with Nr1d1:LUC, the fitted Per2:LUC data in CKOs and CPKOs in Figure 3D and S3E should be shown, as was done for Nr1d1:LUC.

      Lines 258-259, "much less than the half-life of luciferase expressed in fibroblasts under a constitutive promoter" In figure S4D, the y-axis of the PER2::LUC is ~800 while the y-axis of the SV40::LUC is ~600000. The over-expressed LUC by the SV40 promoter might saturate the degradation system in the cell so the comparison is not fair. A weaker promoter with the level similar to Per2 should be used to make the comparison.

      Line 430, "sigma" to "Sigma".

      In figure S2, the classification of rhythms in Drosophila is not clear since even the "Robustly rhythmic" ones have high background noise. Detrending or fitting the data might be able to improve the quality of the rhythms prior to classification.

      In figure S3B, the original blots for Per2 including Input and IP should be shown.

      Supplemental information

      Line 44, "...(reviewed in (Lakin-Thomas,..." to "...(reviewed in Lakin-Thomas,..."

      Line 188, "Period CDS", the full name of CDS should be provided the first time it appears.

      Significance

      The work suggests a link between the TTFL and non-canonical oscillators, which should be interesting to the circadian field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This interesting study by Putker et al. showed that circadian rhythmicity persists in several typical circadian assay systems lacking Cry, including Cry knockout mouse behavior and gene expression in Cry knockout fibroblasts. They further demonstrated weak but significant circadian rhythmicity in Cry- and Per- knockout cells. Cry- (and potentially Per-)-independent oscillations are temperature compensated, and CKId/e still has a role in the period regulation of Cry-independent oscillations.

      Major comments:

      1) The authors propose that the essential role of mammalian Cryptochrome is to bring the robust oscillation. As the authors analyze in many parts, the robustness of oscillation can be validated by the (relative) amplitude and phase/period variation, both of which should be affected significantly by the method for cell synchronization. Unfortunately, the method for synchronization is not adequately written in this version of supplementary information. This reviewer has no objection to the "iterative refinement of the synchronization protocol" but at least the correspondence between which methods were used in which experiments needs to be clearly explained. The detailed method may be found in the thesis of Dr. Wong, but the methods used in this manuscript need to be detailed within this manuscript.

      2) The authors revealed that CKO mice have apparent behavioral rhythmicity under the condition of LL>DD. This is an intriguing finding. However, it should be carefully evaluated whether this rhythmicity (16 hr cycle) is the direct consequence of the circadian rhythmicity observed in CKO and CPKO cells (24 hr cycle), because the period lengths are very different. Is it possible to induce the 16 hr periodicity in CKO mouse behavior by a 16 hr-L:16 hr-D cycle? Is it plausible that the 16 hr rhythmicity is instead the mouse version of internal desynchronization, or another type of methamphetamine-induced or food-entrainable oscillation?

      3) The authors proposed that CKId/e is, at least in part, a component of the cytoscillator (Fig. 5D), and that turnover control of PER (likely to be controlled by CKId/e) may be an interaction point between the cytoscillator and the canonical circadian TTFL (Fig. 4). Strictly speaking, this model is not directly supported by the experimental setting of the current manuscript. The contribution of CKId/e is evaluated in the presence of PER by monitoring the canonical TTFL output (i.e. PER2::LUC); thus it is not clear whether the kinase determines the period of the cytoscillator. It would be valuable to ask whether PF and CHIR have a period-lengthening effect on Nr1d1:LUC in the CPKO cells.

      Minor comments:

      4) The authors argue that the CKO cells' rhythmicity is entrained by the temperature cycle (Fig. 2C). Because the CKO cell data show only one peak after release into constant temperature, it is difficult to conclude whether the cells are entrained or just respond to the final temperature shift.

      5) It would be useful for readers to provide information on the known phenotype of TIMELESS knockout flies; TIM is widely accepted as an essential component of the circadian clock in flies; are there any studies showing the presence of circadian rhythmicity in Tim-knockout flies (even if it is an oscillation seen in limited conditions, such as the neonatal SCN rhythm in mammalian Cry knockout)?

      5) Figure 3C shows that the amount of PER2::LUC mRNA changes ~2 fold between time = 0 hr and 24 hr in the CKO cell. This amplitude is similar to that observed in WT cell although the peak phase is different. Does the PER2::LUC mRNA level show the oscillation in CKO cells?

      6) Figure 3D: the authors discuss the amplitude and variation (whether the signal is noisier or not) of reporter luciferase expression between different cell lines. However, a huge difference in the luciferase signal can be observed even in the detrended bioluminescence plot. This reviewer is concerned that some of the phenotypes of CKO and CPKO MEFs reflect the lower transfection efficiency of the reporter gene, not the nature of the circadian oscillators of these cell lines.

      Significance

      Although Cryptochrome (Cry) has been considered a central component of the mammalian circadian clock, several studies have shown that circadian rhythms are maintained in the absence of Cry, including in the neonate SCN and red blood cells. Thus, although the need for Cry as a circadian oscillator has been debated, its essential role as a circadian oscillator remains established, at least in the cell-autonomous clock driven by the TTFL. This study provides additional evidence that the circadian rhythmicity can persist in the absence of Cry.

      In a more general context, the presence of a non-TTFL circadian oscillator has been one of the major topics in the field of circadian clocks except for the cyanobacteria. In mammals, the authors' and other groups led the discovery of circadian oscillation in the absence of the canonical TTFL by showing the redox cycle in red blood cells (O'Neill, Nature 2011). The presence of circadian oscillation in the absence of Bmal1 was also reported recently (Ray, Science 2020). Bmal1(-CLOCK), CRY, and PER compose the core mechanism of the canonical circadian TTFL; thus, this manuscript adds another layer of evidence for non-TTFL circadian oscillation in mammals.

      Overall, the manuscript reports several surprising results that will receive considerable attention from the circadian community.

      This reviewer has expertise in the field of mammalian circadian clocks, including genomics, biochemistry, and mice's behavior analysis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Reviewers for the positive assessment of our work and their insightful remarks. Please find below a point-by-point response to each comment.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Scheckel et al. report a large dataset on cell type-specific translational profiling of PrD-associated molecular alterations in a mouse model through RiboTRAP and ribosome profiling approaches. They report a more severe alteration in the translatome specifically in astrocytes and microglia as compared to neuronal populations. This highlights that changes in these two cell classes might have a predominant role in the pathology of PrD.

      Data and the methods are presented such that they can be reproduced. The data analysis section of the manuscript could be further elaborated. In particular, it could be clarified which comparisons with existing datasets have been performed, and how. The description of the statistical analysis is sometimes missing (e.g. fig 6e, where it is not clear what the stars on top of the bars stand for, which test was performed, and the significance). Moreover, the section of the methods regarding the western blots presented in figure 6 appears to be missing.

      Fig 6e shows the output (log2 fold change) of DESeq2. Genes with a Benjamini-Hochberg adjusted p value

      **Major concern:**

      The most important improvement the authors should consider for their paper is to more specifically attempt to isolate specific effects on translational efficiency of mRNAs. As it stands, the authors largely use RiboTrap data as a reference to compare their footprinting data - but arguably, this misses mRNAs that are present in the transcriptome and not efficiently recruited onto ribosomes. It appears to be somewhat a lost opportunity to not attempt to test in the dataset (possibly by comparison to RNA-Seq from FACS isolated cells as a reference) whether there is a systematic change in translational efficiency (possibly in mRNAs with specific features?). In the current form, the RiboTrap and footprinting approaches largely serve to isolate mRNAs from cre-defined cell types but given the lack of a "total transcriptome" reference from the respective cells, it can not be easily interpreted whether certain transcripts are heavily regulated at the level of translation. Thus, despite using much more advanced methodologies than the Sorce study, the fundamental conclusions emerging from this work are rather similar to this previously published piece of work.

      Translational changes can be assessed in a cell-type specific manner without artefacts related to dissociation/isolation procedures and are arguably more relevant than transcriptional changes (Haimon et al., Nat. Immunol. 2018). Both the assessment of translation and the investigation of specific cell types differentiate this study from transcriptional profiling studies including Sorce et al. Accordingly, our approach identified > 1000 cell-type specific translational changes that were missed in the Sorce study (Fig. 5a-d).

      We agree however with the reviewer that a comparison of our data with RiboTrap data does not take non-transcribed RNAs into account. We have refrained from such a comparison for several reasons:

      We agree with the reviewer that a systematic comparison of transcriptomes and translatomes in the assessed cell types at every time point would have allowed us to identify genes regulated on a post-transcriptional level. The goal of this study was however to identify biologically relevant prion-induced molecular changes in a cell-type specific manner rather than identify post-transcriptional regulation. To assess the validity of our approach we chose closely related datasets (RiboTrap datasets) to compare our data to. The inclusion of RNAseq datasets from FACS-isolated cells would require an additional 2 years of work since all samples and datasets would need to be newly generated (breeding mice, inoculating mice with prions and waiting for up to 8 months for mice to reach the terminal time point, establishing procedures, generating and analyzing datasets). RNA-Seq from FACS isolated neurons is problematic due to neuronal processes often being lost during the dissociation/isolation procedures. Additionally, dissociation/isolation procedures typically introduce stress-related artefacts. These procedure-induced changes complicate comparisons with techniques that have been optimized to avoid such artefacts (including the method applied in this manuscript). Differences between transcriptional and translational datasets could thus be either due to post-transcriptional regulation or due to artefact differences and are likely difficult to interpret.

      **Additional suggestions:**

      1) In Figure 1d the authors point out occasional neuronal cells exhibiting Rpl10a-GFP expression with arrows. It appears that these arrows may have moved during figure preparation - please check/fix if necessary.

      Thank you for pointing this out. We have fixed the arrows.

      2) In Supplementary Figure 1b and c it appears that the PV labeling is missing in the panel for Rpl10a:GFP controls. If this is intentional please indicate this in the figure legend.

      A co-localization of GFP-positive cells and PV was assessed only in Cre-positive (GFP expressing) mice but not in Cre-negative mice that don’t express GFP. We have clarified this point in the corresponding figure legend.

      3) It appears that the authors sequenced a significant number of libraries generated for multiple time points post-inoculation. From the figures and legends it was not entirely clear to me, how many replicates were analyzed given that in some analyses samples from different time points were combined in a single plot.

      All analyzed samples are listed in Supplementary File 1. We have emphasized this point in the results section.

      4) It was unclear to me how long after inoculation the group of "terminally ill" mice were sacrificed. Somewhere in the text it states that there are 2 months between 24 wpi and terminally ill - but it appears that this was not a preset timepoint but varied from animal to animal based on symptoms. Please clarify.

      We sacrifice mice at the last humane time point possible at which they show terminal disease symptoms, including piloerection, hind limb clasping, kyphosis and ataxia. Intraperitoneally inoculated mice reach that time point at 31 - 32 weeks post inoculation (+/- a few days). Control mice (inoculated with non-infectious brain homogenate) were sacrificed at the same time. We have clarified this point in the methods section.

      5) From the Western blot data in Figure 6f the authors conclude that GFAP expression is upregulated in PrD mice whereas astrocyte number is unchanged. Given that the translatome is assessed based on a Rpl10-GFP dependent on recombination mediated by cre driven from GFAP promoter it is possible that the astrocytic alterations in ribosome footprints are in part a secondary consequence of increased Rpl10-GFP recombination/ expression in PrD mice (due to activation of the GFAP promoter). To estimate the impact of such an effect the authors should compare GFP levels in terminally ill control and PrD mice by western blotting.

      We agree with the reviewer that this information would be important to add. We have therefore assessed GFP levels in Rpl10a:GFP mice bred with GFAPCre and Cx3cr1CreER mice. The corresponding western blots are included in Supplementary Figure 11. GFP levels remained constant in terminally ill GFAPCre mice. This is not surprising since even a low GFAP promoter activity is likely to allow sufficient Cre recombinase expression to remove a STOP cassette allowing GFP expression (controlled by the Rosa26 promoter) in GFAPCre mice. In contrast, we observed an increase in GFP expression in terminally ill Cx3cr1CreER mice, which is most likely linked to the increase in microglia numbers. As pointed out in the manuscript, the translational changes we identified cannot reflect differences in cell numbers due to the nature of our assay. This suggests that a difference in GFP expression does not impact our analyses.

      We have added this data to the manuscript.

      6) The western blot analysis of fig 6f-g has been performed using normalization to calnexin, yet no calnexin signals are shown to support this statement.

      We have included blots of the normalization control calnexin as Supplementary Figure 11a.

      7) Clarify the percentage of non-parenchymal macrophages labeled by the Cx3cr1-creER mouse line, since the authors consider this only to be a minor contamination.

      The labeling of non-parenchymal macrophages using Cx3cr1CreER mice has previously been estimated to be ~1% (Haimon et al., Nat. Immunol. 2018). We have added this information to the manuscript.

      8) Regarding the presentation of the data, Fig 5a would be clearer if in the y axes, for each cell type the order of PrD and Ctrl samples was maintained.

      Fig 5a displays hierarchical clustering based on Euclidean distances. As samples are ordered according to their distance from each other, we cannot change the order as suggested by the reviewer.
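
      To illustrate why the sample order in such a plot is an output of the clustering rather than a free choice, here is a minimal sketch of agglomerative clustering on Euclidean distances. It is not the code used for Fig 5a: the average-linkage rule, the function names and the toy values are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_order(samples):
    """Return sample names in dendrogram leaf order.

    Naive agglomerative clustering with average linkage (an assumption;
    the linkage used in the manuscript is not stated here).
    """
    # Each cluster is a list of sample names, kept in leaf order.
    clusters = [[name] for name in samples]

    def linkage(c1, c2):
        # Average pairwise distance between members of two clusters.
        total = sum(euclidean(samples[a], samples[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]

# Hypothetical toy values, not data from the manuscript.
toy = {
    "Ctrl_1": [1.0, 1.0],
    "PrD_1": [5.0, 5.0],
    "Ctrl_2": [1.2, 0.9],
    "PrD_2": [5.3, 4.8],
}
order = cluster_order(toy)
```

      With these toy values the two control samples end up adjacent, and so do the two PrD samples, regardless of the order in which they were supplied.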

      Reviewer #1 (Significance (Required)):

      Overall, this is an important and interesting study. Besides its insights into the biology, the transcriptomic data will provide a valuable resource for researchers in the field.

      Previous studies employed bulk RNAseq or microdissection for mapping transcriptomic changes (Majer et al. 2019; Sorce et al. 2020 and others). The Sorce et al. study concluded that astrocytic alterations in the transcriptome are more dominant than neuronal gene expression changes. While the conclusion of the present study remains the same, it is the first to use ribosome profiling to dissect actively translated transcripts over the progression of the pathology in the mouse model. Thus, the data presented here would allow for identifying cell type-specific alterations as well as alterations specifically in mRNA translation which would be missed by bulk RNA-Seq and RNA-Seq on FACS-isolated cells. However, the authors do not fully capitalize on this strength, given that no detailed comparisons to a real transcriptome reference are performed (see above).

      This work is of broad interest to scientists in neurodegeneration as well as glial biology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Using a series of Cre-driven mouse strains a GFP-tagged version of RPL10a (a ribosomal protein) was targeted to different cell types allowing Dr Scheckel and colleagues to investigate translational changes as prion disease progresses in mice. Their data suggest massive changes in microglia and astrocytes but not neurons. The approach was particularly powerful as ribosome IP has been combined with ribosome profiling. The manuscript is very well written. What might help, however, is to make the figures more accessible (perhaps change some of the labelling?)

      I have only minor comments regarding some of the figures:

      Fig 1a: This scheme could be improved, adding wpi and better aligning the cell-types in relation to the time when the cell-types were analysed.

      We have replaced weeks with wpi and changed the alignment of cell types to clarify that all cell types were analyzed at every time point.

      Fig 1b-e: The resolution could be improved to better discern the different cell-types.

      We submitted low-quality figures due to an upload limit but will submit final figures of higher quality. Additionally, we have added higher magnification pictures to better discern the different cell types as Supplementary Fig. 1d-e.

      Fig 4: Astrocytes are categorised into A1 and A2 and microglia based on DAM and homeostatic signature (How does this relate to the M1 and M2 classification?).

      The categorization of microglia into homeostatic and disease-associated (as well as other) microglia has largely replaced the initial categorization into pro-inflammatory M1 and anti-inflammatory M2 microglia (Dubbelaar et al., Front Immunol. 2018). We have therefore opted for the more current categorization. This explanation has also been added to the manuscript.

      Reviewer #2 (Significance (Required)):

      Highly significant. I have published on de novo protein synthesis in neurodegenerative disease

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The authors sampled actively translated proteins by cell type in the brains of RiboTag expressing mice under the control of cell specific cre recombination to determine changes in the translational profiles. They injected prions IP to induce prion disease. Their model shows little to no neuron loss at the terminal stage due to animal welfare regulations, but neuronal loss is a key hallmark of prion disease, along with gliosis. However, since other groups under different animal welfare regulations have shown that prion injection is sufficient to fully model the disease given enough time, there is sufficient evidence that this model captures early disease pathogenesis. The methodology used here has some clear advantages over previous cell-type isolation methods that require more lengthy sorting procedures. However, proteins with a long half-life or tightly regulated levels (such as TDP-43) are likely underrepresented by this method. The method also depends strongly on the specificity of the cre driver used; CamkIIa (excitatory N), parvalbumin (inhibitory N), GFAP (A), Cx3cr1 (microglia). While there is some off-target expression of the GFAP and Cx3cr1, the overall expression profiles generally match cell-specific transcriptomes obtained by other groups using other methods. They find major changes in astrocytes and microglia at terminal stages, after the onset of neurological symptoms, and comparatively fewer in neurons. Oligodendrocytes are not examined. The authors are commended on a thorough and well-designed study, especially in the comparison of multiple neuronal and glial types simultaneously.

      **Major comments:**

      Key conclusion 1: "Our results suggest that aberrant translation within glia may suffice to cause severe neurological symptoms and may even be the primary driver of prion disease." This conclusion is well-supported, serving as a hypothesis for future work. The data shows that the most abundant PTG changes are indeed in microglia at 24 wpi, before the onset of symptoms. In addition, although some genes are also differentially translated in the neuronal populations, examination of the Supplemental Tables shows that these are mostly highly expressed glial genes and could represent contamination of the sample during gliosis. The authors may wish to discuss this more prominently to avoid confusion. This data indeed suggests that glial changes alone could be sufficient to produce the neurological symptoms in these mice. However, the authors should include discussion that the two genes changed at 24 weeks in PV neurons (Oprm1, Cyp2s1) do appear to be neuronal and may be relevant to pathogenesis as well. These mRNAs were also decreased in their previous paper conducting bulk sequencing in the hippocampus, according to the authors' online Prion RNAseq Database. Knockout experiments in mouse models have shown that dysregulation of one or a few critical genes in neurons can be sufficient to induce dysfunction and neurological symptoms, and the current evidence does not seem sufficient to rule it out. Fig 3d also suggests that PTGs in PV neurons may be particularly important, even accounting for the additional regions present in the RP analysis.

      We agree with the reviewer that a few critical neuronal genes might be sufficient to induce neurological dysfunction and symptoms and have added this point to the results and discussion. Additionally, we have highlighted that many of the genes changing in neurons are glia-enriched and might reflect glia contamination.

      Key Conclusion 2: "Cell-type specific changes become only evident at late PrD stages." This conclusion is well supported. However, as the authors noted, due to legal constraints their model represents early to mid disease onset rather than a true terminal environment matching that of patients. Therefore, it would be advantageous to choose a more appropriate name for the "terminal" group, perhaps based on one of the key humane endpoint criteria that would help readers in the field to place these important results in context of the overall disease process.

      We have added additional information to clarify our definition of terminal stage to the methods.

      Key Conclusion 3: "This suggests that the prion-induced molecular phenotypes reflect major glia alterations, whereas the neuronal changes responsible for the behavioral phenotypes may be ascribed to biochemically undetectable changes such as altered neuronal connectivity." The authors should modify the second half of this claim. As discussed above, changes to even a few neuronal genes can be sufficient to induce neurodegeneration. The claim that "the neuronal changes responsible for the behavioral phenotypes may be ascribed to biochemically undetectable changes," fails to acknowledge the changes in PV neurons observed in this study, however few they may be. The authors also do not take into account the possible role of transcribed RNAs that are not immediately translated (for example those that accumulate at synapses for fast translation on demand) or the overall proteome, which are not included in their analysis. Though their method cannot detect these components, the authors should examine the implications that such other changes may still be present in the discussion. The authors should also discuss the functions of the few specific PV PTGs and explore their potential relationship with neurodegeneration. This is especially important since the authors acknowledge that a key reason for including PV neurons in the analysis is ample evidence in the literature that they play a role in disease pathogenesis. Finally, the authors note that a top GO term in microglial cells was synaptic transmission. The authors should expand on this finding in the discussion, as the interplay of glia and neurons in the pathogenesis of disease is likely highly relevant.

      We have removed the claim that “behavioral phenotypes may be ascribed to biochemically undetectable changes” and added the point that a few neuronal changes might be sufficient to induce neuronal dysfunction and symptoms. As stated in the manuscript, we believe that the enrichment of the GO term synaptic transmission in microglia is an artefact. We therefore refrained from further discussing this finding and have highlighted in the results that it is an artefact.

      • *Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.* - *Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.*

      As discussed above, the inclusion of RNAseq datasets from FACS isolated cells would require an additional 2 years of work since all samples and datasets would need to be newly generated (breeding mice, inoculating mice with prions and waiting for up to 8 months for mice to reach the terminal time point, establishing procedures, generating and analyzing datasets).

      Key Conclusion 1: No additional experiments needed. Key Conclusion 2: No additional experiments needed. Key Conclusion 3: No additional experiments needed for a modified statement.

      The data and methods are largely reproducible. Additional information should be provided about the methods for Gene Ontology analysis, how it was controlled, and what was used as a significance measure.

      We have added additional information about the GO analysis to the methods section. The complete list of GO terms is now included as Supplementary File 10.

      Some groups contain only two animals. At least three should be included per group for a minimally robust analysis.

      We have tried to include 3 replicates per group as suggested by the reviewer. In a few exceptions, we lost an individual sample and one sample had to be excluded due to low quality. In these instances (GFAP_2wpi Ctrl; CamKIIa_CX_term_Ctrl, CamKIIa_CX_term_PrD, Cx3cr1_term_Ctrl and Cx3cr1_term_PrD) we ensured that both replicates showed a high correlation and could still yield reliable results (see below). Consistently, the DESeq2 algorithm (which can also handle just 2 replicates per group) identified differentially translated genes in the terminal samples.

      **Minor comments:**

      Fig. 1 c-e all panels should have a scale bar. E, closer insets or larger images are needed to see the colocalization in these very small cells.

      We have added scale bars to all panels. A colocalization is indeed not visible in the uploaded low-quality Figures that were submitted due to the size limit. We believe that a colocalization is visible in the high-quality final pictures but are also happy to provide closer insets upon editorial request.

      Fig. 5f: To allow interpretation of the Gene Ontology analysis, authors should include the number of genes involved in the pathway and the number of those genes found in their sample input list.

      We have added details regarding the GO analysis to the methods section, and are now providing the requested information in Supplementary File 10.
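
      To show why the two requested numbers matter, here is a minimal sketch of a one-sided hypergeometric enrichment test, one common way of scoring GO terms. It is not necessarily the test used in the manuscript; the function name and all gene counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(background, pathway_size, input_size, overlap):
    """P(X >= overlap) when drawing input_size genes from a background that
    contains pathway_size genes annotated to the GO term."""
    max_overlap = min(pathway_size, input_size)
    total = comb(background, input_size)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, input_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical numbers: 15000 background genes, 120 genes in the pathway,
# 300 genes in the input list, 12 of which belong to the pathway.
p = hypergeom_enrichment_p(15000, 120, 300, 12)
print(f"enrichment p-value: {p:.3g}")
```

      The same overlap of 12 genes would be far less surprising for a pathway of 2000 genes than for one of 120, which is why the pathway size and the overlap are both needed to interpret a GO term.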

      Fig. S6: It is not clear from viewing the figure or the legend what the percentages on the axes refer to.

      The principal components 1 and 2 are plotted on the x and y axes, respectively. The % of variance explained by these principal components is indicated. We have added this information to the figure legend.
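
      For readers unfamiliar with this convention, the sketch below shows how such axis percentages are typically derived: each principal component's variance is divided by the total variance across all components. The variance values and function names are made up for illustration.

```python
def percent_variance_explained(component_variances):
    """Convert per-component variances (eigenvalues) into percentages of the total."""
    total = sum(component_variances)
    return [100.0 * v / total for v in component_variances]

def axis_label(index, percent):
    """Format an axis label such as 'PC1: 60% variance'."""
    return f"PC{index}: {percent:.0f}% variance"

# Hypothetical variances captured by three principal components.
variances = [6.0, 3.0, 1.0]
percents = percent_variance_explained(variances)

# Labels for the x axis (PC1) and y axis (PC2).
labels = [axis_label(i + 1, p) for i, p in enumerate(percents[:2])]
print(labels)
```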

      Fig. S7: the gene numbers are confusing because they do not match the data in Fig. 4a. It would be helpful to use the same LFC cutoff as in Fig. 4a to avoid misunderstandings by the reader, or explain why no cutoff is used and what information the authors wish to convey by presenting the data that way.

      Typically, all significant changes (p adj …

      Fig S9: The legend indicates that genes changed in all 5 datasets are colored in green, however this is not easily visible on the graphs (appears more gray).

      Genes changing in all datasets are colored in green in Fig. 5. Genes changing in all datasets are colored in grey in Supplementary Fig. 9. We have adjusted the corresponding legends. The quality of the figures is very low due to the upload limit. The final figures will be of higher quality.

      Fig. S10: on page 12 Supplementary Fig. 10c is referenced, but likely refers to 10b. Throughout manuscript: It should be RNase, not RNAse.

      Both points have been addressed.

      Reviewer #3 (Significance (Required)):

      This work provides an important conceptual advance in prion disease research that glia may be primary drivers of disease equal to or surpassing certain neuronal populations. Though the authors have shown previously that glial changes are dominant in bulk sequencing of the hippocampus, cell type-specific analysis adds an important level of detail to convince the field that few transcriptional changes occur in neurons though neurological defects are already present. Historically, neuronal defects have been assumed to occupy the main role, with glia being largely ignored. This echoes recent similar changes in other areas of the neurodegenerative disease field where we are recognizing the important roles of glia in pathogenesis, and how they may be modulated to treat disease.

      Their findings in PV neurons also may reflect early key changes in this important neuronal population that contribute to neurological symptom onset. They will allow further study of the genes and pathways involved and may lead to additional effective treatments for disease. Finally, the thorough comparison of multiple neuronal and glial populations will allow future investigation of the interplay of neurons and microglia in pathogenesis and shows the importance of studying them synergistically rather than individually.

      *Audience:*

      The neurodegenerative disease field in general will be interested in the findings. Immunologists, other neuroscientists, and pharmaceutical and other drug development organizations will also be influenced by the work.

      *Own expertise:*

      Neurodegenerative disease, transgenic mouse models, neuropathology, translational neuroscience

      REFEREE'S CROSS-COMMENTING:

      I agree with Reviewer 1 that a comparison of the total transcriptome with ribosomally active transcripts would aid the interpretation of this work. It would also uncover or refute the presence of cell-type differences in translation efficiency that directly impact the authors' major conclusion that glia are more affected than neurons. I support the request of this additional experiment.

      As discussed above, we have refrained from such a comparison since 1) the scope of this study was to identify biologically relevant prion-induced molecular changes and not to study post-transcriptional regulation, 2) the generation of such a dataset would take ~2 years, and 3) differences between transcriptional and translational changes are likely a combination of post-transcriptional regulation and artefact-induced changes that are probably difficult to interpret.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors sampled actively translated proteins by cell type in the brains of RiboTag expressing mice under the control of cell specific cre recombination to determine changes in the translational profiles. They injected prions IP to induce prion disease. Their model shows little to no neuron loss at the terminal stage due to animal welfare regulations, but neuronal loss is a key hallmark of prion disease, along with gliosis. However, since other groups under different animal welfare regulations have shown that prion injection is sufficient to fully model the disease given enough time, there is sufficient evidence that this model captures early disease pathogenesis. The methodology used here has some clear advantages over previous cell-type isolation methods that require more lengthy sorting procedures. However, proteins with a long half-life or tightly regulated levels (such as TDP-43) are likely underrepresented by this method. The method also depends strongly on the specificity of the cre driver used; CamkIIa (excitatory N), parvalbumin (inhibitory N), GFAP (A), Cx3cr1 (microglia). While there is some off-target expression of the GFAP and Cx3cr1, the overall expression profiles generally match cell-specific transcriptomes obtained by other groups using other methods. They find major changes in astrocytes and microglia at terminal stages, after the onset of neurological symptoms, and comparatively fewer in neurons. Oligodendrocytes are not examined. The authors are commended on a thorough and well-designed study, especially in the comparison of multiple neuronal and glial types simultaneously.

      Major comments:

      Key conclusion 1: "Our results suggest that aberrant translation within glia may suffice to cause severe neurological symptoms and may even be the primary driver of prion disease." This conclusion is well-supported, serving as a hypothesis for future work. The data shows that the most abundant PTG changes are indeed in microglia at 24 wpi, before the onset of symptoms. In addition, although some genes are also differentially translated in the neuronal populations, examination of the Supplemental Tables shows that these are mostly highly expressed glial genes and could represent contamination of the sample during gliosis. The authors may wish to discuss this more prominently to avoid confusion. This data indeed suggests that glial changes alone could be sufficient to produce the neurological symptoms in these mice. However, the authors should include discussion that the two genes changed at 24 weeks in PV neurons (Oprm1, Cyp2s1) do appear to be neuronal and may be relevant to pathogenesis as well. These mRNAs were also decreased in their previous paper conducting bulk sequencing in the hippocampus, according to the authors' online Prion RNAseq Database. Knockout experiments in mouse models have shown that dysregulation of one or a few critical genes in neurons can be sufficient to induce dysfunction and neurological symptoms, and the current evidence does not seem sufficient to rule it out. Fig 3d also suggests that PTGs in PV neurons may be particularly important, even accounting for the additional regions present in the RP analysis.

      Key Conclusion 2: "Cell-type specific changes become only evident at late PrD stages." This conclusion is well supported. However, as the authors noted, due to legal constraints their model represents early to mid disease onset rather than a true terminal environment matching that of patients. Therefore, it would be advantageous to choose a more appropriate name for the "terminal" group, perhaps based on one of the key humane endpoint criteria that would help readers in the field to place these important results in context of the overall disease process.

      Key Conclusion 3: "This suggests that the prion-induced molecular phenotypes reflect major glia alterations, whereas the neuronal changes responsible for the behavioral phenotypes may be ascribed to biochemically undetectable changes such as altered neuronal connectivity." The authors should modify the second half of this claim. As discussed above, changes to even a few neuronal genes can be sufficient to induce neurodegeneration. The claim that "the neuronal changes responsible for the behavioral phenotypes may be ascribed to biochemically undetectable changes," fails to acknowledge the changes in PV neurons observed in this study, however few they may be. The authors also do not take into account the possible role of transcribed RNAs that are not immediately translated (for example those that accumulate at synapses for fast translation on demand) or the overall proteome, which are not included in their analysis. Though their method cannot detect these components, the authors should examine the implications that such other changes may still be present in the discussion. The authors should also discuss the functions of the few specific PV PTGs and explore their potential relationship with neurodegeneration. This is especially important since the authors acknowledge that a key reason for including PV neurons in the analysis is ample evidence in the literature that they play a role in disease pathogenesis. Finally, the authors note that a top GO term in microglial cells was synaptic transmission. The authors should expand on this finding in the discussion, as the interplay of glia and neurons in the pathogenesis of disease is likely highly relevant.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Key Conclusion 1: No additional experiments needed. Key Conclusion 2: No additional experiments needed. Key Conclusion 3: No additional experiments needed for a modified statement.

      The data and methods are largely reproducible. Additional information should be provided about the methods for Gene Ontology analysis, how it was controlled, and what was used as a significance measure. Some groups contain only two animals. At least three should be included per group for a minimally robust analysis.

      Minor comments:

      Fig. 1 c-e all panels should have a scale bar. E, closer insets or larger images are needed to see the colocalization in these very small cells. Fig. 5f: To allow interpretation of the Gene Ontology analysis, authors should include the number of genes involved in the pathway and the number of those genes found in their sample input list. Fig. S6: It is not clear from viewing the figure or the legend what the percentages on the axes refer to. Fig. S7: the gene numbers are confusing because they do not match the data in Fig. 4a. It would be helpful to use the same LFC cutoff as in Fig. 4a to avoid misunderstandings by the reader, or explain why no cutoff is used and what information the authors wish to convey by presenting the data that way. Fig S9: The legend indicates that genes changed in all 5 datasets are colored in green, however this is not easily visible on the graphs (appears more gray). Fig. S10: on page 12 Supplementary Fig. 10c is referenced, but likely refers to 10b. Throughout manuscript: It should be RNase, not RNAse.

      Significance

      This work provides an important conceptual advance in prion disease research that glia may be primary drivers of disease equal to or surpassing certain neuronal populations. Though the authors have shown previously that glial changes are dominant in bulk sequencing of the hippocampus, cell type-specific analysis adds an important level of detail to convince the field that few transcriptional changes occur in neurons though neurological defects are already present. Historically, neuronal defects have been assumed to occupy the main role, with glia being largely ignored. This echoes recent similar changes in other areas of the neurodegenerative disease field where we are recognizing the important roles of glia in pathogenesis, and how they may be modulated to treat disease.

      Their findings in PV neurons also may reflect early key changes in this important neuronal population that contribute to neurological symptom onset. They will allow further study of the genes and pathways involved and may lead to additional effective treatments for disease. Finally, the thorough comparison of multiple neuronal and glial populations will allow future investigation of the interplay of neurons and microglia in pathogenesis and shows the importance of studying them synergistically rather than individually.

      Audience:

      The neurodegenerative disease field in general will be interested in the findings. Immunologists, other neuroscientists, and pharmaceutical and other drug development organizations will also be influenced by the work.

      Own expertise:

      Neurodegenerative disease, transgenic mouse models, neuropathology, translational neuroscience

      REFEREE'S CROSS-COMMENTING:

      I agree with Reviewer 1 that a comparison of the total transcriptome with ribosomally active transcripts would aid the interpretation of this work. It would also uncover or refute the presence of cell-type differences in translation efficiency that directly impact the authors' major conclusion that glia are more affected than neurons. I support the request of this additional experiment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Using a series of Cre-driven mouse strains a GFP-tagged version of RPL10a (a ribosomal protein) was targeted to different cell types allowing Dr Scheckel and colleagues to investigate translational changes as prion disease progresses in mice. Their data suggest massive changes in microglia and astrocytes but not neurons. The approach was particularly powerful as ribosome IP has been combined with ribosome profiling. The manuscript is very well written. What might help, however, is to make the figures more accessible (perhaps change some of the labelling?)

      I have only minor comments regarding some of the figures:

      Fig 1a: This scheme could be improved, adding wpi and better aligning the cell-types in relation to the time when the cell-types were analysed. Fig 1b-e: The resolution could be improved to better discern the different cell-types. Fig 4: Astrocytes are categorised into A1 and A2 and microglia based on DAM and homeostatic signature (How does this relate to the M1 and M2 classification?).

      Significance

      Highly significant. I have published on de novo protein synthesis in neurodegenerative disease

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Scheckel et al. report a large dataset on cell type-specific translational profiling of PrD-associated molecular alterations in a mouse model through RiboTRAP and ribosome profiling approaches. They report a more severe alteration in the translatome specifically in astrocytes and microglia as compared to neuronal populations. This highlights that changes in these two cell classes might have a predominant role in the pathology of PrD.

      Data and the methods are presented such that they can be reproduced. The data analysis section of the manuscript could be further elaborated. In particular, it could be clarified which comparisons with existing datasets have been performed and how. Statistical analysis description is sometimes missing (e.g. fig 6e, not clear what the stars on top of the bars stand for, which test was performed and the significance). Moreover, the section of the methods regarding the western blots presented in figure 6 appears to be missing.

      Major concern:

      The most important improvement the authors should consider for their paper is to more specifically attempt to isolate specific effects on translational efficiency of mRNAs. As it stands, the authors largely use RiboTrap data as a reference to compare their footprinting data - but arguably, this misses mRNAs that are present in the transcriptome and not efficiently recruited onto ribosomes. It appears to be somewhat a lost opportunity to not attempt to test in the dataset (possibly by comparison to RNA-Seq from FACS isolated cells as a reference) whether there is a systematic change in translational efficiency (possibly in mRNAs with specific features?). In the current form, the RiboTrap and footprinting approaches largely serve to isolate mRNAs from cre-defined cell types but given the lack of a "total transcriptome" reference from the respective cells, it can not be easily interpreted whether certain transcripts are heavily regulated at the level of translation. Thus, despite using much more advanced methodologies than the Sorce study, the fundamental conclusions emerging from this work are rather similar to this previously published piece of work.
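
      To make the requested comparison concrete, the following is a minimal sketch of how translational efficiency (TE) is commonly defined: the log2 ratio of ribosome footprint abundance to mRNA abundance for a gene, with the disease effect expressed as the difference in TE between conditions. The function names, the pseudocount and the numbers are illustrative assumptions, and the inputs are assumed to be normalized abundances.

```python
import math

def translational_efficiency(footprint, mrna, pseudocount=1.0):
    """Log2 ratio of footprint abundance to mRNA abundance for one gene."""
    return math.log2((footprint + pseudocount) / (mrna + pseudocount))

def delta_te(fp_disease, mrna_disease, fp_ctrl, mrna_ctrl):
    """Change in translational efficiency between disease and control."""
    return (translational_efficiency(fp_disease, mrna_disease)
            - translational_efficiency(fp_ctrl, mrna_ctrl))

# Hypothetical gene A: footprints rise while mRNA stays constant,
# which points to regulation at the level of translation.
gene_a = delta_te(fp_disease=15, mrna_disease=3, fp_ctrl=7, mrna_ctrl=3)

# Hypothetical gene B: footprints and mRNA rise together,
# which points to a transcriptional change with unchanged efficiency.
gene_b = delta_te(fp_disease=31, mrna_disease=15, fp_ctrl=15, mrna_ctrl=7)

print(gene_a, gene_b)
```

      Without a total-transcriptome reference, the two hypothetical genes above would look alike in footprint data alone, which is the ambiguity this comment points to.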

      Additional suggestions:

      1) In Figure 1d the authors point out occasional neuronal cells exhibiting Rpl10a-GFP expression with arrows. It appears that these arrows may have moved during figure preparation - please check/fix if necessary.

      2) In Supplementary Figure 1b and c it appears that the PV labeling is missing in the panel for Rpl10a:GFP controls. If this is intentional please indicate this in the figure legend.

      3) It appears that the authors sequenced a significant number of libraries generated for multiple time points post-inoculation. From the figures and legends it was not entirely clear to me, how many replicates were analyzed given that in some analyses samples from different time points were combined in a single plot.

      4) It was unclear to me how long after inoculation the group of "terminally ill" mice were sacrificed. Somewhere in the text it states that there are 2 months between 24 wpi and terminally ill - but it appears that this was not a preset timepoint but varied from animal to animal based on symptoms. Please clarify.

      5) From the Western blot data in Figure 6f the authors conclude that GFAP expression is upregulated in PrD mice whereas astrocyte number is unchanged. Given that the translatome is assessed based on a Rpl10-GFP dependent on recombination mediated by cre driven from GFAP promoter it is possible that the astrocytic alterations in ribosome footprints are in part a secondary consequence of increased Rpl10-GFP recombination/ expression in PrD mice (due to activation of the GFAP promoter). To estimate the impact of such an effect the authors should compare GFP levels in terminally ill control and PrD mice by western blotting.

      6) The western blot analysis of fig 6f-g has been performed using a normalization over calnexin, yet no calnexin signal is shown to support this statement.

      7) Clarify the percentage of non-parenchymal macrophages targeted by the Cx3cr1-creER mouse line, since the authors consider this only to be a minor contamination.

      8) Regarding the presentation of the data, Fig 5a would be clearer if, on the y axis, the order of PrD and Ctrl samples was maintained for each cell type.

      Significance

      Overall, this is an important and interesting study. Besides its insights into the biology, the transcriptomic data will provide a valuable resource for researchers in the field.

      Previous studies employed bulk RNAseq or microdissection for mapping transcriptomic changes (Majer et al. 2019; Sorce et al. 2020 and others). The Sorce et al study concluded that astrocytic alterations in the transcriptome are more dominant than neuronal gene expression changes. While the conclusion of the present study remains the same, it is the first to use ribosome profiling to dissect actively translated transcripts over the progression of the pathology in the mouse model. Thus, the data presented here would allow for identifying cell type-specific alterations as well as alterations specifically in mRNA translation which would be missed by bulk RNA-Seq and RNA-Seq on FACS-isolated cells. However, the authors do not fully capitalize on this strength, given that no detailed comparisons to a real transcriptome reference are performed (see above).

      This work is of broad interest to scientists in neurodegeneration as well as glial biology.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      __Review 1 Summary:__ *In this manuscript, Borah et al showed that Heh2, a component of INM, can be co-purified with a specific subset of nucleoporins. They also found that disrupting interactions between Heh2 and NPC causes NPC clustering. Lastly, they showed that the knockout of Nup133, which does not physically interact with Heh2, causes the dissociation of Heh2 from NPCs. These findings led the authors to propose that Heh2 acts as a sensor of NPC assembly state.*

      __Reviewer 1 major comment 1:__ The authors claimed that Heh2 acts as a sensor of NPC assembly state, as evidenced by their finding that Heh2 fails to bind with NPCs in nup133 Δ cells (Fig2, Fig 5). However, there is a possibility that the association between Heh2 and NPCs is merely affected by the clustering of the NPCs (as the authors discussed) but not related to the structural integrity of NPC.

      Our Response: We agree that this is a possibility, however, we ask the reviewer to also consider that we artificially cluster NPCs using the anchor away system (Figure 3C) and this does not affect Heh2’s association with NPCs. Thus, clustering per se is insufficient to disrupt Heh2 binding to NPCs. We will also make changes in the text to make this point.

      Reviewer 1 major comment 2: In addition, their data showing that the Heh2-NPCs association is not easily disrupted by knocking out the individual components of the IRC (Fig. 5A and 5D), also disfavor the idea that Heh2 could sense NPC assembly state.

      Our Response: There are three considerations here. The first is that as this is the first evidence of any kind of “NPC assembly state” sensor, it is difficult to make any assumptions as to what specifically such a sensor would be monitoring. i.e. perhaps sensing only the ORC is what is functionally important. Second, for obvious reasons, we only tested non-essential IRC nups so by definition there is inherent functional redundancy that maintains NPC function and thus there may be no need to “sense” anything in the absence of these IRC nups. Further (and last), the IRC is essential for NPC assembly. Thus, without an IRC there is no NPC assembly state to sense.

      Reviewer 1 major comment 3: Since some nup knockout strains, other than nup133 Δ, are also known to show the NPC clustering (ex. nup159 (Gorsch JCB 1995) and nup120 (Aitchison JCB 1995; Heath JCB 1995)), it will be worth trying to monitor the localization of Heh2 and its interaction with nucleoporins (by Heh2-TAP) using these strains. While Nup159 is a member of the cytoplasmic complex, Nup120 is an ORC nucleoporin. Thus, biochemical and phenotypical analysis using these mutant cells will be useful to clarify if the striking phenotypes the authors found are specific to nup133 knockout strain (or ORC Nup knockouts) or could be commonly observed in the strains that show NPC clustering. Another interesting point is that Nup159 shows strong interaction with Heh2, even in nup133Δ cells. As the authors mentioned, Nup159-Heh2 interaction may not be sufficient for Heh2-NPC association, but it could be important for NPC clustering.

      Our Response: These are excellent points and we agree that there is a need to more thoroughly explore how NPC clustering driven by abrogating the function of other nups impacts Heh2’s association with NPCs. Thus, in a revised manuscript, we would examine Heh2’s association with NPCs in several additional genetic backgrounds where NPCs cluster.

      Reviewer 1 major comment 4: Figure 4C: Is it known that rapamycin treatment in this strain did not affect the protein levels of nucleoporins? Otherwise, the authors should confirm this by western blotting (at least some of them).

      Our Response: This is a good point and we will directly address this with Western blotting of some nups.

      Reviewer 1 major comment 5: Figure 5: The authors mentioned (line 256-257) that "in all cases the punctate, NPC-like distribution of Heh2-GFP was retained (Fig 5D)". However, nup107 KO strain seems to show more diminished punctate staining as compared with other strains. To clarify this, the authors should express mCherry tagged Nup as in Fig. 2 or Fig. 3.

      Our Response: Yes, we agree and in fact this observation is consistent with the fact that there is an ER-pool of Heh2 observed in this strain and we observe loss of nup interactions in the affinity purification. We will include a more thorough quantification of this in a revised manuscript and more directly address this in the text.

      **Minor comments:**

      Reviewer 1 minor comment 1: Figure 4A and 4B: The authors should show Scatter plot as in Fig. 2 and Fig. 3.

      We will include this in a revised manuscript.

      Reviewer 1 minor comment 2: Figure 5C: Explanations of the arrowheads is missing in the figure legend.

      Thank you for pointing this out, it will be fixed in a revised manuscript.

      Reviewer 1 minor comment 3: Figure 6: Is there any information as to where Heh2 (316-663) is localized in the cell?

      As this truncation lacks INM targeting sequences, it is found throughout the cortical ER. The determinants of Heh2 targeting (including truncations) have been extensively evaluated in King et al. 2006, Meinema et al., 2011 and Rempel et al. 2020. We will make this clearer in the revised manuscript.

      Reviewer 1 minor comment 4: Figure 6B: Nucleoporins should be marked with color circles as in Fig. 1 and Fig. 5.

      This will be done.

      Reviewer 2

      Borah et al. present a biochemical and cell biological examination of the inner nuclear membrane (INM) protein Heh2 and its putative interactions with the nuclear pore complex (NPC). The potential conceptual advance of this study is that Heh2 interacts with the NPC, while mutations believed to trigger NPC mis-assembly are shown to abolish interaction with Heh2, leading to the hypothesis that Heh2 is a sensor for NPC assembly states within the INM. The conclusions would undoubtedly be of broad interest to the nucleocytoplasmic transport field, but the evidence provided thus far is insufficient to build confidence and consequently this manuscript is premature for publication.

      Our Response: We thank the reviewer for recognizing the potential for a significant conceptual advance for the field but object to the notion that the work is “premature for publication”. This is a highly subjective statement that does not seem to meet the mission or purpose of the Review Commons platform. While it is possible that some of the conclusions drawn in our manuscript might not be fully supported by the data in its current form, there is a substantial body of work here that is certainly publishable.

      Reviewer 2 major comment 1: The TAP-tag Heh1/Heh2 pulldowns are the most significant experiment presented, and on face value provide compelling evidence that Heh2 interacts with the NPC. It is stated that mass spectrometry (MS) was used to confirm the identities of the labeled bands yet there is no methods section, nor any MS data reported in the manuscript. Given the large number of unspecified proteins observed in these gels, and the single-step pulldown methodology used, knowledge of the contaminants present may aid in elucidating how Heh2 pulls down NPC components. Consequently, within the supplementary materials, the authors must indicate which regions of the gel were excised for MS analysis and provide a table listing all of the proteins that were detected for each sample, including the number of unique/expected peptides observed.

      Our Response: This was a major oversight on our part and a revised manuscript will contain all relevant details with regards to the MS analysis including a more detailed description of the excised bands and the quantification of spectra derived from these bands.

      Reviewer 2 major comment 2a: The representative micrographs provided across Figures 2, 3, 4, 5 and 6 are very noisy. Particularly in the case of the mCherry labeled nucleoporins, this is both unusual and unfortunate given this is used to infer colocalization of Heh2 with the NPC.

      Our Response: These micrographs are not unusual and are in fact of respectable quality. We agree that the apparent “noise” is unfortunate, but this is simply a reality of the yeast system. We remind the reviewer that there are only ~100 to ~200 NPCs per budding yeast nucleus, which is an order of magnitude smaller than a typical mammalian cell nucleus. Further, the copy number of yeast nups per NPC is half of the mammalian cell NPC. Further, budding yeast are spherical with a cell wall that is extremely effective at scattering light; they are also highly autofluorescent (particularly in the red channel). Lastly, unlike in mammalian cells, budding yeast NPCs are mobile on the nuclear envelope. Thus, co-localization is challenging (particularly with the long exposures required to obtain good images). This is why clustering of NPCs driven by nup133∆ cells has provided one of the key assays in the field to assess whether a given protein associates with NPCs at the level of light microscopy.

      Reviewer 2 major comment 2b: As a result it is unclear whether this experiment can be used to differentiate between NPC colocalization vs. nuclear envelope colocalization.

      Our Response: The reviewer is correct. Co-localization between Heh2-GFP and any Nup-mCherry is insufficient to assess NPC association in WT cells. In fact, as we point out in Figure 3B, at best one can expect a correlation of r = 0.48 for two well established nups. Thus, to further support the conclusion that Heh2 associates with NPCs, we established the Nsp1-FRB NPC clustering assay (Figure 3).

      Reviewer 2 major comment 2c: The authors should include negative controls for an alternative NE membrane protein that doesn't bind the NPC, which would be expected to exhibit a reduced level of colocalization with NPC proteins when compared to Heh2. For example, Heh1 would be suitable, given the clear-cut negative pulldown data and its prior usage as a negative control in Figure 4.

      Our Response: This is included in Figure 3D.

      Reviewer 2 major comment 3a. Figure 2. The rim staining for the Nup82-mCherry in the WT background is unusually punctate, bringing into question the viability of the cells imaged.

      Our Response: As the middle cell in the panel is undergoing cell division, these cells are clearly viable. All our imaging is performed on mid-log phase cultures.

      Reviewer 2 major comment 3b. Why has ScNup82, a cytoplasmic filament component, been selected for colocalization experiments when Heh2 is proposed to interact with the inner ring complex?

      Our Response: The resolution of a conventional light microscope is, at best, 200 nm in x, y. As NPCs are 100 nm in diameter, even two NPCs side-by-side cannot be resolved. The IRC is tens of nm away from the cytoplasmic filaments thus any nup is relevant for a co-localization analysis with a light microscope.
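
      The resolution argument can be checked with a short calculation. The sketch below uses the standard Abbe estimate for lateral resolution, d = wavelength / (2 x NA). The emission wavelength (510 nm, roughly that of GFP) and the numerical aperture (1.4, a typical oil-immersion objective) are illustrative assumptions, not values reported in the manuscript.

```python
def abbe_limit_nm(wavelength_nm, numerical_aperture):
    """Abbe lateral resolution limit, d = wavelength / (2 * NA), in nm."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and numerical aperture must be positive")
    return wavelength_nm / (2.0 * numerical_aperture)


def is_resolvable(separation_nm, limit_nm):
    """True if two point sources this far apart can be told apart."""
    return separation_nm >= limit_nm


# Assumed, illustrative optics: ~510 nm emission, NA 1.4 objective.
limit = abbe_limit_nm(510.0, 1.4)

# Two adjacent NPCs, each ~100 nm in diameter, have centres ~100 nm apart.
adjacent_npcs_resolved = is_resolvable(100.0, limit)
```

      Under these assumptions the limit is a little under 200 nm, so two neighbouring NPCs fall below it, and structures within a single NPC that are only tens of nm apart are further still from being resolved.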

      Reviewer 2 major comment 3c: Additionally, the experiments shown in panels A and C are not directly comparable, ScNup82 is an asymmetric cytoplasmic nucleoporin, while SpNup107 is located in the Y-shaped Nup84 nucleoporin complex and present on both faces of the NPC. This experiment should be repeated with scNup84 to match panel C, additionally a viability dot spot assay and western blot analysis of the labeled proteins should be conducted.

      Our response: These are in fact directly comparable within the limits of resolution of light microscopy as described above. Viability assays are not required here as both nups are essential and perturbation to their function would lead to inviability.

      Reviewer 2 major comment 4: Figure 3, the authors use yeast strains where proteins are tagged with FRB and FKBP12 domains, which dimerize upon the addition of rapamycin inducing NPC clusters. The authors then observe the effect this has on Heh2 NPC colocalization. However, Rapamycin may also have an effect independent from the induced dimerization event. Negative controls should be performed in strains lacking the FRB and FKBP12 tagged proteins to demonstrate that Rapamycin doesn't modify Heh2 localization independently of NPC clustering.

      Our response: This is a good point and important control that we performed in prior studies, see Colombi et al., JCB, 2013. We will be more explicit in describing that this control has been done.

      Reviewer 2 major comment 5: Figure 4. The authors provide a qualitative description of the colocalization presented, while in all other instances they calculate a Pearson correlation coefficient. This is significant because Heh2 appears to be evenly distributed within the NE of the DMSO control (panel B). Given the presented hypothesis isn't colocalization expected with Nup192? As a minimum, a Pearson correlation coefficient analysis should be conducted and added to Figure 4.

      Our response: This will be included in a revised manuscript.

      Reviewer 2 major comment 6: Figure 4. Pom152-mCherry localizes at both the NE and strongly within the cytoplasm, which is unexpected given typical rim staining phenotypes observed previously for both Pom152-YFP and Pom152-GFP strains (Katta, ..., Jaspersen et al., Genetics (2015) & Upla, ..., Fernandez-Martinez et al., Structure (2017), respectively). Given the unusually weak rim staining observed throughout, viability assays of the strains listed in Table S1 and protein expression analysis of the tagged nucleoporins via western blot is necessary.

      Our response: This is not localization in the cytoplasm but is in fact autofluorescence from the yeast vacuole. We regret we were not more explicit in describing this and we will make the manuscript more accessible for the non-yeast expert. Performing the Western blot analysis for all strains requested by the reviewer would require a battery of antibodies to the endogenous proteins to directly assess how tagging influences nup levels, which we do not have (nor does anyone else that we are aware of). This is also not standard practice in the field as it is an onerous and unnecessary burden.

      Reviewer 2 major comment 7: Figure 5A. The TAP-tagged pulldowns from ∆Pom152 and ∆Nup133 strains appear to be from a different round of experiments than the previous deletion strains presented. Interestingly, there appears to be an additional band at approximately 250 kDa in both cases that is not present in any other experiments. This band could be a contaminant observed due to different experimental conditions, or a protein that exclusively binds to Heh2 in the ∆Pom152 and ∆Nup133 background. Either way the authors should identify this protein with MS to address this ambiguity.

      Our response: We will include negative controls for these specific experiments to show that this is a non specific band.

      Reviewer 2 major comment 8: Figure 6B. Please label the nucleoporin bands in the TAP-tagged pulldowns.

      Our response: This will be done.

      Reviewer 2 major comment 9: Figure 6D. Please specify Heh2-GFP clustering in the y-axis.

      Our response: As this represents both Heh2-GFP and heh2-1-570-GFP, we will keep it as is to avoid confusion.

      Reviewer 2 major comment 10: Under the results section titled 'Heh2 binds to specific nups in evolutionarily distant yeasts', the authors state that spHeh2 co-purifies with "several specific species". The meaning is unclear, this sentence should be rephrased and the specific species clearly described.

      Our response: Ok.

      Reviewer 2 major comment 11: Under the results section titled 'Heh2 fails to interact with NPCs lacking Nup133', the authors refer to a Pearson correlation coefficient of -0.03 as a clear anticorrelation. Instead state there was no correlation.

      Our response: Ok.

      Reviewer 2 major comment 12: In the discussion, the authors state that "clustering itself may sterically preclude an interaction with Heh2". The text should be expanded to explain this in more detail, it is not clear from the presented data why this would occur.

      Our response: Ok.

      Reviewer 2 comment on significance: the manuscript is premature for publication.

      Our Response: Such a statement has no relevance to this form of review as a decision as to whether a study is premature for publication should be made by journal editors, not reviewers. We would argue quite strongly that we have definitively shown that Heh2 binds to NPCs, that it does so in multiple evolutionarily distant yeasts and that this binding is functionally relevant. For example, we can specifically disrupt the association of Heh2 with NPCs with a specific domain deletion and observe a loss of function phenotype (e.g. NPC clustering). What all three reviewers agree on is that the concept of a “NPC assembly state sensor” needs additional data to be fully supported, although we note that this reviewer did not provide any suggestions for how we might achieve this goal. We further note that we added the qualifier “may” into the title of the work. We will therefore perform additional experiments as outlined in comments to Reviewer 1 to support this conclusion in order to introduce this as a new concept in the field.

      Reviewer Comment from Cross Commenting: It seems to me that all reviewers agree that the manuscript is premature for publication. The data thus far do not support the conclusion that Heh2 may be an NPC assembly sensor nor does it provide any mechanistic insight. Reading the comments of the other two reviewers makes me more negative, as it is clear that the paper also lacks scientific rigor. The manuscript is a great starting point for a rigorous dissection but I do not see this paper to be a candidate for a broad impact journal.

      Our Response: The statement that this manuscript is premature for publication is an opinion and does not seem to reflect the sentiment of the other reviewers. It is also confounding that this reviewer suggests that this work lacks rigor. With the exception of the omission of the MS analysis (our fault), the data are of high quality and rigorously quantified. Our assertion of rigor and data quality is based on our collective team’s many decades-long history of publishing and reviewing papers at the highest levels in this field. Questions as to the quality of the data as stated by this reviewer (and only this reviewer) in fact address limitations of light microscopy and the yeast system more generally in this one respect.


      Reviewer 3

      Reviewer 3 Summary part a: This is quite an interesting manuscript that explores the relationship between an INM protein, Heh2, and NPCs. It represents an extension of earlier work performed by this group in which it was shown that the HEH2 gene shares genetic interactions with the genes encoding various nucleoporins. Heh2 belongs to an intriguing family of conserved proteins that includes its orthologue, Heh1, as well as human MAN1 (LEMD3) and LEMD2, among others. Each of these proteins contains two transmembrane domains with the N- and C-terminal regions extending in to the nucleoplasm. The two TM domains are separated by a short lumenal loop.

      In this study, the authors show that a population of Heh2 is associated with Nups of the NPC inner ring complex. This was demonstrated initially in pulldown experiments. The authors go on to show that when NPCs are caused to aggregate, by physical tethering employing an FKBP/FRP system in combination with Rapamycin, Heh2, but not Heh1, colocalizes with the NPC clusters.

      Our Response: Thank you to the reviewer for recognizing the value of this work.

      Reviewer 3 Summary part b: Although not stated explicitly in the manuscript, this would imply that there is a population of Heh2 that resides in the NPC membrane domain, with the remainder in the INM. As an idle question, is there any evidence for a similar localization of MAN1 or LEMD2 in mammals? I am guessing probably not.

      Our Response: We regret this was not made more clear but the idea that there is a pool of Heh2 at the POM and a pool at the INM is an important conclusion of the work and was stated in the results - we’ll re-emphasize in the revised discussion. As to whether MAN1 or LEMD2 has a similar NPC association, we hypothesize that MAN1 but not LEMD2 will indeed interact with NPCs in mammalian cells. This is based on considering that we show that both the budding and fission yeast orthologues of MAN1 share this association so unless it was lost in evolution, this is a likely outcome of future studies.

      Reviewer 3 Significance statement a: The complications arise when the authors show that an alternative method of NPC aggregation (although they did this first), involving Nup133 deletion, results in failure of Heh2 to co-aggregate. In other words, Nup133 is required for the association of Heh2 with NPCs. The issue here is that there is no evidence for an interaction between Heh2 and Nup133, and furthermore that loss of Nup133 (a Y complex component of the outer ring complex) leaves the inner ring complex intact.

      Our Response: We tested the nup133Δ background first as this is the standard approach for assessing NPC-association of a given protein so we felt this would be logical for a reader in the field. Further, while the disruption of Heh2’s binding by loss of Nup133 may be a complication, we prefer to see it as an opportunity for discovery. As described in our manuscript, we have chosen to interpret this result in the context of a new biological function/concept with Heh2 being a novel “NPC assembly state” sensor. While one could argue that we have not fully met this bar yet, we will perform additional experiments as outlined in our response to reviewer 1 to help support this compelling conclusion.

      Reviewer 3 Significance statement b: What is clear, however, is that Heh2 seems to be required to inhibit NPC aggregation since Heh2 deficient cells exhibit NPC clusters. The association between Heh2 and IRC Nups resides in the C-terminal nucleoplasmic winged helix domain. The N-terminal domain, in contrast confers INM localization.

      Our Response: We agree.


      Reviewer 3 Significance statement c: I must admit, I am in two minds about this manuscript. The data clearly show that Heh2 is associated with IRC components and I agree with the authors that this protein may well have a role in NPC assembly quality control perhaps in the guise of a chaperone. However, I find it hard to come up with a convincing model for the effects of Nup133. On the one hand, one could make an argument that the data presented here is too preliminary and fails to provide a complete story. On the other hand, it does provide an intriguing foundation for future studies and I do feel positively disposed towards it. In short, I have no fundamental complaints about the science, I am just uncertain as to whether the study is ready for publication.

      Our Response: This statement nicely articulates the challenge with this manuscript as there are some solid findings (that Heh2 binds specifically to NPCs etc.) but also a provocative finding (that loss of Nup133 breaks Heh2’s interaction with NPCs despite not physically interacting). Thus, there is a decision to be made about whether there is value in introducing a novel concept to the field once additional data is provided in a revised manuscript.

      Reviewer 3 Cross commenting: I have no fundamental disagreements with either of the other two reviewers. The comment from Reviewer#2 summarises this quite neatly. While I have fewer concerns about the quality of the data as presented, I think we all agree that at best the study is preliminary. What the authors need to do is to construct a coherent model that will account for the observations described here and then to design experiments that will test this model. I'm not suggesting that they must have a complete story, but they do need to go beyond what is in the current manuscript.

      Our Response: We appreciate that the reviewer does not have any questions about the quality of our data, but we argue that we have in fact presented the most coherent interpretation of the data as it currently stands. As described above, we intend to attempt to solidify this model by performing experiments suggested by reviewer 1.



      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Reply to the Reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses

      The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A). Though this is not stated in the MS

      1. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      2. The text contains several...

      Response: We added...

      Referee #2

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This is quite an interesting manuscript that explores the relationship between an INM protein, Heh2, and NPCs. It represents an extension of earlier work performed by this group in which it was shown that the HEH2 gene shares genetic interactions with the genes encoding various nucleoporins. Heh2 belongs to an intriguing family of conserved proteins that includes its orthologue, Heh1, as well as human MAN1 (LEMD3) and LEMD2, among others. Each of these proteins contains two transmembrane domains with the N- and C-terminal regions extending in to the nucleoplasm. The two TM domains are separated by a short lumenal loop.

      In this study, the authors show that a population of Heh2 is associated with Nups of the NPC inner ring complex. This was demonstrated initially in pulldown experiments. The authors go on to show that when NPCs are caused to aggregate, by physical tethering employing an FKBP/FRP system in combination with Rapamycin, Heh2, but not Heh1, colocalizes with the NPC clusters. Although not stated explicitly in the manuscript, this would imply that there is a population of Heh2 that resides in the NPC membrane domain, with the remainder in the INM. As an idle question, is there any evidence for a similar localization of MAN1 or LEMD2 in mammals? I am guessing probably not.

      Significance

      The complications arise when the authors show that an alternative method of NPC aggregation (although they did this first), involving Nup133 deletion, results in failure of Heh2 to co-aggregate. In other words, Nup133 is required for the association of Heh2 with NPCs. The issue here is that there is no evidence for an interaction between Heh2 and Nup133, and furthermore that loss of Nup133 (a Y complex component of the outer ring complex) leaves the inner ring complex intact. What is clear, however, is that Heh2 seems to be required to inhibit NPC aggregation since Heh2 deficient cells exhibit NPC clusters. The association between Heh2 and IRC Nups resides in the C-terminal nucleoplasmic winged helix domain. The N-terminal domain, in contrast confers INM localization.

      I must admit, I am in two minds about this manuscript. The data clearly show that Heh2 is associated with IRC components and I agree with the authors that this protein may well have a role in NPC assembly quality control perhaps in the guise of a chaperone. However, I find it hard to come up with a convincing model for the effects of Nup133. On the one hand, one could make an argument that the data presented here is too preliminary and fails to provide a complete story. On the other hand, it does provide an intriguing foundation for future studies and I do feel positively disposed towards it. In short, I have no fundamental complaints about the science, I am just uncertain as to whether the study is ready for publication.

      REFEREES CROSS COMMENTING

      I have no fundamental disagreements with either of the other two reviewers. The comment from Reviewer#2 summarises this quite neatly. While I have fewer concerns about the quality of the data as presented, I think we all agree that at best the study is preliminary. What the authors need to do is to construct a coherent model that will account for the observations described here and then to design experiments that will test this model. I'm not suggesting that they must have a complete story, but they do need to go beyond what is in the current manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Borah et al. present a biochemical and cell biological examination of the inner nuclear membrane (INM) protein Heh2 and its putative interactions with the nuclear pore complex (NPC). The potential conceptual advance of this study is that Heh2 interacts with the NPC, while mutations believed to trigger NPC mis-assembly are shown to abolish interaction with Heh2, leading to the hypothesis that Heh2 is a sensor for NPC assembly states within the (INM). The conclusions would undoubtably be of broad interest to the nucleocytoplasmic transport field, but the evidence provided thus far is insufficient to build confidence and consequently this manuscript is premature for publication.

      Specific comments:

      (1)The TAP-tag Heh1/Heh2 pulldowns are the most significant experiment presented, and on face value provide compelling evidence that Heh2 interacts with the NPC. It is stated that mass spectroscopy (MS) was used to confirm the identities of the labeled bands yet there is no methods section, nor any MS data reported in the manuscript. Given the large number of unspecified proteins observed in these gels, and the single-step pulldown methodology used, knowledge of the contaminants present may aid in elucidating how Heh2 pulls down NPC components. Consequently, within the supplementary materials, the authors must indicate which regions of the gel were excised for MS analysis and provide a table listing all of the proteins that were detected for each sample, including the number of unique/expected peptides observed.

      (2)The representative micrographs provided across Figures 2, 3, 4, 5 and 6 are very noisy. Particularly in the case of the mCherry labeled nucleoporins, this is both unusual and unfortunate given this is used to infer colocalization of Heh2 with the NPC. As a result it is unclear whether this experiment can be used to differentiate between NPC colocalization vs. nuclear envelope colocalization. The authors should include negative controls for an alternative NE membrane protein that doesn't bind the NPC, which would be expected to exhibit a reduced level of colocalization with NPC proteins when compared to Heh2. For example, Heh1 would be a suitable, given the clear-cut negative pulldown data and its prior usage as a negative control in Figure 4.

      (3)Figure 2. The rim staining for the Nup82-mCherry in the WT background is unusually punctate, bringing into question the viability of the cells imaged. Why has ScNup82, a cytoplasmic filament component, been selected for colocalization experiments when Heh2 is proposed to interact with the inner ring complex? Additionally, the experiments shown in panels A and C are not directly comparable, ScNup82 is an asymmetric cytoplasmic nucleoporin, while SpNup107 is located in the Y-shaped Nup84 nucleoporin complex and present on both faces of the NPC. This experiment should be repeated with scNup84 to match panel C, additionally a viability dot spot assay and western blot analysis of the labeled proteins should be conducted.

      (4)Figure 3, the authors use yeast strains where proteins are tagged with FRB and FKBP12 domains, which dimerize upon the addition of rapamycin inducing NPC clusters. The authors then observe the effect this has on Heh2 NPC colocalization. However, Rapamycin may also have an effect independent from the induced dimerization event. Negative controls should be performed in strains lacking the FRB and FKBP12 tagged proteins to demonstrate that Rapamycin doesn't modify Heh2 localization independently of NPC clustering.

      (5)Figure 4. The authors provide a qualitative description of the colocalization presented, while in all other instances they calculate a Pearson correlation coefficient. This is significant because Heh2 appears to be evenly distributed within the NE of the DMSO control (panel B). Given the presented hypothesis isn't colocalization expected with Nup192? As a minimum, a Pearson correlation coefficient analysis should be conducted and added to Figure 4.

      (6)Figure 4. Pom152-mCherry localizes at both the NE and strongly within the cytoplasm, which is unexpected given typical rim staining phenotypes observed previously for both Pom152-YFP and Pom152-GFP strains (Katta, ..., Jaspersen et al., Genetics (2015) & Upla, ..., Fernandez-Martinez et al., Structure (2017), respectively). Given the unusually weak rim staining observed throughout, viability assays of the strains listed in Table S1 and protein expression analysis of the tagged nucleoporins via western blot is necessary.

      (7)Figure 5A. The TAP-tagged pulldowns from ∆Pom152 and ∆Nup133 strains appear to be from a different round of experiments than the previous deletion strains presented. Interestingly, there appears to be an additional band at approximately 250 kDa in both cases that is not present in any other experiments. This band could be a contaminant observed due to different experimental conditions, or a protein that exclusively binds to Heh2 in the ∆Pom152 and ∆Nup133 background. Either way the authors should identify this protein with MS to address this ambiguity.

      (8)Figure 6B. Please label the nucleoporin bands in the TAP-tagged pulldowns.

      (9)Figure 6D. Please specify Heh2-GFP clustering in the y-axis.

      (10)Under the results section titled 'Heh2 binds to specific nups in evolutionarily distant yeasts', the authors state that spHeh2 co-purifies with "several specific species". The meaning is unclear, this sentence should be rephrased and the specific species clearly described.

      (11)Under the results section titled 'Heh2 fails to interact with NPCs lacking Nup133', the authors refer to a Pearson correlation coefficient of -0.03 as a clear anticorrelation. Instead state there was no correlation.

      (12)In the discussion, the authors state that "clustering itself may sterically preclude an interaction with Heh2". The text should be expanded to explain this in more detail, it is not clear from the presented data why this would occur.

      Significance

      the manuscript is premature for publication.

      REFEREES CROSS COMMENTING

      It seems to me that all reviewers agree that the manuscript is premature for publication. The data thus far do not support the conclusion that Heh2 may be an NPC assembly sensor nor does it provide any mechanistic insight. Reading the comments of the other two reviewers makes me more negative, as it is clear that the paper also lacks scientific rigor. The manuscript is a great starting point for a rigorous dissection but I do not see this paper to be a candidate for a broad impact journal.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Borah et al showed that Heh2, a component of INM, can be co-purified with a specific subset of nucleoporins. They also found that disrupting interactions between Heh2 and NPC causes NPC clustering. Lastly, they showed that the knockout of Nup133, which does not physically interact with Heh2, causes the dissociation of Heh2 from NPCs. These findings led the authors to propose that Heh2 acts as a sensor of NPC assembly state.

      Major comments:

      The authors claimed that Heh2 acts as a sensor of NPC assembly state, as evidenced by their finding that Heh2 fails to bind with NPCs in nup133 Δ cells (Fig2, Fig 5). However, there is a possibility that the association between Heh2 and NPCs is merely affected by the clustering of the NPCs (as the authors discussed) but not related to the structural integrity of NPC. In addition, their data showing that the Heh2-NPCs association is not easily disrupted by knocking out the individual components of the IRC (Fig. 5A and 5D), also disfavor the idea that Heh2 could sense NPC assembly state. Since some nup knockout strains, other than nup133 Δ, are also known to show the NPC clustering (ex. nup159 (Gorsch JCB 1995) and nup120 (Aitchison JCB 1995; Heath JCB 1995)), it will be worth trying to monitor the localization of Heh2 and its interaction with nucleoporins (by Heh2-TAP) using these strains. While Nup159 is a member of the cytoplasmic complex, Nup120 is an ORC nucleoporin. Thus, biochemical and phenotypical analysis using these mutant cells will be useful to clarify if the striking phenotypes the authors found are specific to nup133 knockout strain (or ORC Nup knockouts) or could be commonly observed in the strains that show NPC clustering. Another interesting point is that Nup159 shows strong interaction with Heh2, even in nup133Δ cells. As the authors mentioned, Nup159-Heh2 interaction may not be sufficient for Heh2-NPC association, but it could be important for NPC clustering.

      Figure 4C: Is it known that rapamycin treatment in this strain did not affect the protein levels of nucleoporins? Otherwise, the authors should confirm this by western blotting (at least some of them).

      Figure 5: The authors mentioned (line 256-257) that "in all cases the punctate, NPC-like distribution of Heh2-GFP was retained (Fig 5D)". However, nup107 KO strain seems to show more diminished punctate staining as compared with other strains. To clarify this, the authors should express mCherry tagged Nup as in Fig. 2 or Fig. 3.

      Minor comments:

      Figure 4A and 4B: The authors should show Scatter plot as in Fig. 2 and Fig. 3.

      Figure 5C: Explanations of the arrowheads is missing in the figure legend.

      Figure 6: Is there any information as to where Heh2 (316-663) is localized in the cell?

      Figure 6B: Nucleoporins should be marked with color circles as in Fig. 1 and Fig. 5.

      Significance

      Heh2 has been implicated in the quality control of NPC assembly, however, the molecular mechanism of how Heh2 interacts and affects NPC assembly/function remained largely unknown. The relationship between Heh2 and specific nucleoporins shown in this study is novel and interesting. While the data are overall good quality and convincing, the current manuscript still lacks the molecular mechanistic insights. In particular, it is not clear if the observed phenotypes are due to structural defects of NPC or NPC clustering.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): The manuscript by Huh et al. reports that oxidative stress causes fragmentation of a specific tyrosine pre-tRNA, leading to two parallel outcomes. First, the fragmentation depletes the mature tRNA, causing translational repression of genes that are disproportionally rich in tyrosine codon. These genes are enriched for those involved in electron transport chain, cell cycle and growth. Second, the fragmentation generates tRNA fragments (tRFs) that bind to two known RNA binding proteins. Finally, the authors identify a nuclease that is needed for efficient formation of tyrosine tRFs.

      Comment 1: The authors should include a short diagram indicating the various known steps of pre-tRNA fragmentation (perhaps as a supplement) for general readers.

      Response: We thank the reviewer for their suggestion. Pre-tRNA fragmentation is still an unknown field but an initial introduction is best seen from pre-tRNA processing where there is a cleavage event for pre-tRNAs with an intron. This is a complex subject but a recent review from Hopper and Nostramo has done an excellent job in describing the current field in yeast and vertebrate species (Hopper and Nostramo, Front. Genet., 2019). We have added this citation and new text in the manuscript about pre-tRNA processing for general readers to follow up on. We feel that a supplementary figure might be a bit too brief in describing the knowns and unknowns of pre-tRNA processing and fragmentation.

      Comment 2: I find the enrichment for mitochondrial electron transport chain (ETC) curious. The ETC includes several oxidoreductases, which may be rich in tyrosine as it is a common amino acid used in electron transfer. The depletion of the tyrosine tRNA from among many tRNAs under oxidative stress may not be incidental but related to an attempt by the cell to decrease oxygen consumption to avoid further oxidative damage. The authors could further mine their data to corroborate this hypothesis. For example, are the ETC genes among the targets of the RNA binding proteins targeted by tyrosine tRFs? This could potentially connect the effects of mature tRNA depletion and tRFs.

      Response: We thank the reviewer for this very interesting comment and insight, which had not occurred to us. The relationship between this response and oxidoreductase regulation could be a factor in both the tRNA and tRF modulations seen in our cells. Interestingly, we find that many oxidoreductase genes (such as the NDUF family) are bound by hnRNPA1 by CLIP. In new data, we have done stability experiments with the tRF (new Fig 7E-F) to show the regulon of hnRNPA1 is modulated with overexpression and LNA against the tRF, revealing that this tRNA fragmentation response modulates expression of certain oxidoreductase genes. However, we do not see clear and significant differences for ETC genes in particular. As hnRNPA1 is known to act as both a promoter and destabilizer of genes depending on context, it is likely that further and more detailed work will be needed to parse this hypothesis out in future studies.

      Comment 3: In figure 4A, the authors should provide the tyrosine codon content of the overlap genes and show how much it differs from a randomly selected sample.

      Response: We have identified an error in our manuscript where the overlap actually identifies 109 proteins rather than the 102 reported in the original manuscript. We apologize for this oversight. As for the overlap proteins, we plotted the downstream proteins detected in the proteome by mass spectrometry based on Tyr-codon content. As explained in the text, the targets we tested were chosen for having higher-than-median Tyr-codon content, as seen in the histogram, and for showing some of the greatest reduction after Tyr tRNA-GUA depletion (Fig S4A). The other proteins found in the overlap will fall in a similar pattern along the histogram.

      Comment 4: Fig.6F, lower panel: the model should show pre-tRNA, as opposed to mature tRNA, because it is the former that is fragmented.

      Response: We apologize for the confusion. The model in Fig 7F was supposed to denote the pre-tRNA with the trailer and leader sequences intact initially, then lost with processing to mature tRNA. To make it clearer, we have now labeled the first species as “Pre-tRNA.”

      Reviewer #1 (Significance (Required)): This study is comprehensive and novel, and includes several orthogonal and complementary approaches to provide convincing evidence for the conclusions. The main discovery is significant because it presents an important advance in post-transcriptional control of gene expression. The process of tRF formation was previously thought not to affect the levels of mature tRNA. This study changes that understanding by describing for the first time the depletion of a specific mature tRNA as its precursor form is fragmented to generate tRFs. Finally, the authors identify DIS3L2 as a nuclease involved in fragmentation. This is also an important finding as the only other suspected nuclease, albeit with contradictory evidence, is angiogenin. Collectively, the findings of this study would be of interest to a broad group of scientists. I only have a few minor comments and suggestions (see above).

      Response: We thank the reviewer for their very positive and insightful comments and feedback.

      REFEREES CROSS-COMMENTING I have the following comments on other reviewers' critiques. Regarding the concern that the disappearance of the pre-tRNA could be a transcriptional response (reviewer 2), I think that the appearance of tRFs makes this scenario unlikely. If pre-tRNA levels decreased due to transcriptional repression, wouldn't one expect that both tRNA and the tRF levels diminish concomitantly? Reviewer 3 raises the issue of cross hybridization in Northern blots. The authors indicate that they "could not detect the other tyrosyl tRNA (tRNA Tyr AUA) in MCF10A cells by northern blot..." (page 6). Also, they gel extracted tRFs and sequenced them (figure S6B), directly identifying the fragments. I think these findings mitigate the concern of cross hybridization and clearly identify the nature of tRFs. Finally, I think that the codon-dependent reporter experiment (figure 5D) addresses many issues surrounding codon dependent vs indirect effects. In that experiment, the authors mutate 5 tyrosine codons of a reporter gene and demonstrate that the encoded protein is less susceptible to repression in response to oxidative stress.

      Response: We thank the reviewer for their tremendous insights. We are in agreement regarding the three points in the cross-comments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): This very interesting study from Sohail Tavazoie's lab describes the consequences of oxidative stress on the tRNA pool in human epithelial cell lines. As previously described, the authors observed that tRNA fragments were generated upon exposure of cells to ROS. In addition, the authors made the novel observation that specific mature tRNAs were also depleted under these conditions. In particular, the authors focused on tyrosyl tRNA-GUA, which was decreased ~50% after 24 hours of ROS exposure, an effect attributable to a decrease in the pre-tRNA pool. Depletion of tyrosyl tRNA resulted in reduced translation of specific mRNAs that are enriched in tyr codons and likely contributed to the anti-proliferative effects of ROS exposure. In addition, the authors demonstrated that the tRFs produced from tyr tRNA-GUA can interact with specific RNA binding proteins (SSB and hnRNPA1). The major contribution of this paper is the novel finding that stress-induced tRNA fragmentation can result in a measurable reduction of specific mature tRNAs, leading to a selective reduction in translation of mRNAs that are enriched for the corresponding codons. Previously, studies of tRNA fragmentation largely focused on the functions of the tRFs themselves and it was generally believed that the mature tRNA pool was not impacted sufficiently to reduce translation. The findings reported here therefore add a new dimension to our understanding of the cellular consequences of stress-induced tRNA cleavage. Overall, the data are of high quality, the experiments are convincing, and the conclusions are well supported. I have the following suggestions that would further strengthen the study and bolster the conclusions. Comment 1: The authors have not formally demonstrated that the reduction in pre-tRNA in H2O2-treated cells is a consequence of pre-tRNA cleavage. It is possible that reduced transcription contributes to this effect. 
Pulse-chase experiments with nucleotides such as EU would provide a tractable approach to demonstrate that a labelled pool of pre-tRNA is rapidly depleted upon H2O2 treatment, which would further support their model. Since the response occurs rapidly (within 1 hour), it would be feasible to monitor the rate of pre-tRNA depletion during this time period in control vs. H2O2-treated cells.

      Response: We thank the reviewer for their suggestion and agree that testing for a transcriptional effect using a pulse-chase experiment would further support these findings. We are grateful to both reviewer 1 and reviewer 2 in the cross-comments for recognizing that the tRNA repression response we see is too rapid to be a transcriptional response and that the fact that this tRNA depletion response occurs concomitantly with the tRF generation supports our model that this is a pre-tRNA fragmentation response. It would be of interest for future studies to also examine the impact of cellular stress on tRNA transcription.

      Comment 2: To what extent is the growth arrest that results from H2O2 treatment attributable to tyr tRNA-GUA depletion (Fig. 3A)? Since the reduction in tRNA levels is only partial (~50%), it should be feasible to restore tRNA levels by overexpression (strategy used in Fig. 3E, S3B) and determine whether this measurably rescues growth in H2O2-treated cells.

      Response: We thank the reviewer for their suggestion. Originally, we had also thought of this experiment and attempted to test this hypothesis. Upon experimentation, we ran into technical challenges that prevented us from drawing any conclusions. The problems were that we were unable to develop a cell line that stably overexpressed the Tyr tRNA-GUA and had to settle for a transient overexpression that only lasted for a couple of days (Fig S3B). For transient transfection, we used Lipofectamine 3000 (Invitrogen) that has associated cell toxicities and requires a control RNA transfection in lipofectamine. In addition, H2O2 in itself is a stress. The simultaneous occurrence of these two stresses led to a combination of cell death and cell growth for the control and experimental group. Given the high variability, we were unable to draw any conclusions on cell growth with this combination. We hope to identify a way to stably overexpress Tyr tRNA-GUA in the future to address this hypothesis.

      Comment 3: Knockdown of YARS/tyr tRNA-GUA resulted in reduced expression of EPCAM, SCD, and USP3 at both the protein and mRNA levels (Fig. 4C-D, S4C). In contrast, H2O2-exposure reduced the abundance of these proteins without affecting mRNA levels (Fig. 5A-B, S5A). The authors should comment on this apparent discrepancy. Perhaps translational stalling induces No-Go decay, but it is unclear why this response would not also be triggered by ROS.

      Response: We would like to clarify that out of the three genes in Fig. S5A, only EPCAM mRNA levels were significantly reduced with H2O2-exposure while no changes were observed in the mRNA levels of USP3 or SCD. It is difficult to ascertain the reason for EPCAM mRNA reduction, but one hypothesis is that it reflects timing and steady-state levels. Levels of mRNAs seen with knockdown of YARS or tRNA represent steady-state levels where mRNA decay and transcriptional changes can be easily seen. Following H2O2, the data is collected at 24 hours, which may be before mRNA effects can be fully appreciated. We have edited the text to clarify the uncertainty involved. We agree with the reviewer’s insightful comment and find these differences to be interesting and will consider them in future studies to better understand the interplay between translation and mRNA levels in the context of tRNA depletion.

      Comment 4: In addition to the analyses of ribosome profiling in Fig. 5E-F, it might also be helpful to show a metagene analysis of ribosome occupancy centered upon UAC/UAU codons (for an example, see Figure 2 of Schuller et al., Mol Cell, 2017). This has previously been used as an effective way to visualize ribosome stalling at specific codons. Additionally, do the authors see a global correlation between tyrosine codon density and reduced translational efficiency in tRNA knockdown cells?

      Response: We thank the reviewer for their important suggestion. We have expanded the analysis to look at codon usage scatterplots across all codons for shTyr and shControl replicates (Fig S5D). The 5 most changed codons are labeled with UAC, a codon for the tyrosine amino acid, being the most affected (red arrow). Consistent with our model, a tyrosine codon, when at the ribosome A-site, is most affected with depletion of the corresponding tRNA. The text has also been edited to reflect our new analysis providing further evidence that ribosomal stalling could occur upon depletion of this tRNA. The gray outline around the regression line represents the 95% confidence interval.

      Fig S5D
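
      The per-codon comparison described above can be illustrated with a minimal sketch. This is not the analysis code used for Fig S5D; it assumes that an A-site codon has already been assigned to each ribosome footprint, and the function names, the pseudocount, and the toy codon lists are hypothetical placeholders rather than data from the study.

```python
from collections import Counter
from math import log2


def codon_frequencies(a_site_codons):
    """Fraction of footprints whose A-site sits on each codon."""
    counts = Counter(a_site_codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def log2_fold_changes(control_codons, knockdown_codons, pseudo=1e-6):
    """Per-codon log2(knockdown / control) of A-site codon frequency."""
    ctrl = codon_frequencies(control_codons)
    kd = codon_frequencies(knockdown_codons)
    codons = set(ctrl) | set(kd)
    # The small pseudocount (an assumption) avoids division by zero
    # for codons seen in only one condition.
    return {
        c: log2((kd.get(c, 0.0) + pseudo) / (ctrl.get(c, 0.0) + pseudo))
        for c in codons
    }


def most_changed(fold_changes, n=5):
    """The n codons with the largest absolute log2 fold change."""
    ranked = sorted(fold_changes.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


# Hypothetical toy input: one A-site codon per footprint.
control = ["UAC"] * 10 + ["GCU"] * 45 + ["AAG"] * 45
knockdown = ["UAC"] * 20 + ["GCU"] * 40 + ["AAG"] * 40

changes = log2_fold_changes(control, knockdown)
top = most_changed(changes, n=1)
print(top)
```

      In this toy input, the codon whose A-site frequency doubles in the knockdown is ranked first; with real data the same ranking would be applied per replicate before any regression or confidence interval is drawn.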

      As seen in Fig 5F, a significant overlap was noted for genes with the lowest translational efficiency and tyrosine enrichment. We did further analysis to test if a direct and linear relationship exists between tyrosine codon density and reduced translational efficiency on the global scale (i.e. does more stalling occur with more tyrosine codons on a global scale). We again see that a reduced translational efficiency is significantly correlated with tyrosine codon enrichment (above median parameters) in the tRNA knockdown ribosome profiling data. However, our analysis on a direct relationship between codon density and translational efficiency is inconclusive. This analysis is limited given the sequencing depth and number of experimental replicates available and we lack the statistical power to draw strong conclusions. To prevent overstating our claims, we have omitted any conclusions regarding this second analysis.

      Comment 5: MINOR: On pg. 4, the authors state that tRF-tyrGUA is the most highly induced tRF, but Fig. S1B appears to show stronger induction of tRF-LeuTAA.

      Response: The reviewer is correct in that the data from Fig S1B shows Leu-tRFs with higher induction. Our text was meant to suggest we focused on tRF-TyrGUA due to higher band intensity seen on northern blot validation. We have edited the text in the manuscript to clarify this.

      Reviewer #2 (Significance (Required)): The major advance provided by this work is the demonstration that stress-induced tRNA cleavage can reduce the abundance of the mature tRNA pool sufficiently to impact translation. Moreover, the effect on mature tRNAs is selective, resulting in the reduced translation of a specific set of mRNAs under these conditions. These findings reveal previously unknown consequences of oxidative stress on gene expression and will be of interest to scientists working on cellular stress responses and post-transcriptional regulation.

      Response: We thank the reviewer for the kind comments and feedback.

      REFEREES CROSS-COMMENTING Regarding the concern that the disappearance of the pre-tRNA could be a transcriptional response (reviewer 2), I think that the appearance of tRFs makes this scenario unlikely. If pre-tRNA levels decreased due to transcriptional repression, wouldn't one expect that both tRNA and the tRF levels diminish concomitantly? Here is what I was thinking: The generation of tRFs does not generally result in reduction in levels of the mature tRNAs. So you can imagine a scenario where oxidative stress causes tRF generation from the mature tyr tRNA (which does not impact its steady-state levels), as is the case for other tRNAs. At the same time, decreased transcription would reduce the pre-tRNA pool, leading to a delayed reduction in mature tRNA, as observed. However, looking back at the data, I see that after only 5 min of H2O2 treatment, the authors observed reduced pre-tRNA and increased tRFs (Fig. 2A). This seems very fast for a transcriptional response, which would presumably require some kind of signal transduction. In addition, when you consider the amount of tRFs produced in Fig. S2C, it is hard to imagine that this would not impact the mature tRNA pool if they were derived from there. So I agree that the transcriptional scenario seems unlikely. Nevertheless, I think that looking at pre-tRNA degradation directly with the pulse-chase strategy would strengthen their story, so I would like to give the authors this suggestion. However, I am fine with listing this as an optional experiment which would enhance the paper but should not be essential for publication.

      Response: We thank the reviewer for these insightful comments. As mentioned above, five minutes is likely too rapid for a transcriptional response to be the main effect of H2O2 on Tyr-tRNA GUA. Moreover, the concomitant appearance of the tRF at this time-point makes tRNA fragmentation the most parsimonious and likely explanation rather than transcriptional repression, which would not cause a tRNA fragment to occur concurrently. Moreover, extraction and sequencing of the tRF shows it likely derives from the pre-tRNA as a 5’ leader sequence is present. We appreciate the reviewer’s suggestion and scholarly willingness to reassess their own hypothesis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): The major findings in this manuscript are: 1.) Oxidative stress in human cells causes a decrease in tyrosine tRNA levels and accumulation of tyrosine tRNA fragments; 2.) The depletion of tyrosyl-tRNA synthetase or tyrosine tRNAs in human cells results in altered translation of certain genes and reduced cell growth and 3.) hnRNPA1 and SSB/La can bind tyrosine tRNA fragments. There is also preliminary evidence that the DIS3L2 endonuclease contributes to the appearance of tyrosine tRNA fragments upon oxidative stress. Based upon these results, the Authors conclude that tyrosine tRNA depletion is part of a conserved stress-response pathway to regulate translation in a codon-based manner. **Major comments:** Comment 1: There is a considerable amount of data in this paper and the experiments are performed in a generally rigorous manner. Sufficient details are provided for reproducing the findings and all results have been provided to appropriate databases (RNA-Seq and ribosome profiling).

      Response: We thank the reviewer for the positive comments and feedback.

      Comment 2: The manuscript uses a probe against the 5' half of Tyrosine tRNA for Northern blotting. However, tRNA probes can be prone to cross-hybridization, especially with some tRNA isoacceptors being similar in sequence. Thus, the blots in Figure 2 and Supplemental Figures should be probed with an oligonucleotide against the 3' half of tRNA-Tyr. This will confirm the pre- and mature tRNA-Tyr bands detected with the 5' probe. Moreover, this will determine whether 3' tRNA-Tyr fragments accumulate.

      Response: We agree that the reviewer is correct in suggesting that the 3’ tRNA-Tyr might also accumulate. However, we disagree that any accumulation of the 3’ tRF might be relevant in our particular model for multiple reasons. As supported by reviewer 1’s cross-comments, cross-hybridization between isoacceptors (GUA vs AUA) would be unlikely as Tyr-AUA could not even be detected by the initial 5’ tRF probe. Additionally, the Tyr-GUA and Tyr-AUA sequences differ and do not align at the nucleotide level. Furthermore, the extraction and sequencing of the 5’ tRF (Fig S6B) confirms the 5’ leader sequence unique to the pre-tRNA (also noted by reviewer 1). While the 3’ halves of many Tyr-GUA tRNAs are similar, we find selective binding of our RNA binding proteins only to the 5’ tRF. The 3’ tRF may play some role in binding to other proteins in cell regulatory pathways but such experiments would be outside the scope of this study.

      Comment 3: The analysis of the proteomic and ribosome profiling experiments seem rather limited, or based upon what was presented in this manuscript. If additional analyses were performed, then they should be included as well, even if they yielded negative results. For example, the manuscript identifies 102 proteins that decrease after tRNA-Tyr depletion and YARS-depletion with a certain threshold of Tyr codon content. We realize the Authors were trying to find potential genes that are modulated under all three conditions. However, this does not provide information whether there is a relationship between a certain codon such as Tyr and protein abundance if only binning into two categories representing below and above a certain codon content. The Authors should plot the abundance change of each detected protein versus each codon and determine the correlation coefficient. This analysis is important for substantiating the conclusion of a codon-based system of specifically modulating transcripts enriched for certain codons. Otherwise, how could changes in tRNA-Tyr levels modulate codon-dependent gene expression if two different transcripts with the same Tyr codon content exhibit differences in translation? Moreover, this analysis should be performed with all the other codons as well.

      Response: We have identified an error in our manuscript where the overlap identified 109 proteins and not 102 as reported previously. We apologize for this oversight. While the reviewer is correct in that identifying codon dependent changes for all 3500+ proteins detected would offer greater insight, our study was specifically focused on tyrosine as we observed this tRNA to become depleted and our experimental system modulated this specific tRNA. As for the second point on Tyr tRNA level effects on translation, we felt that the most rigorous course would be to assess causality rather than an association for this tRNA and its codon in regulating a target gene. The only way to do this is to perform mutagenesis and reporter studies. Our codon dependent reporter clearly shows a direct effect on translation in a tyrosine-codon dependent manner. As for translational regulation for two different transcripts with the same Tyr codon content, the molecular mechanisms that could dictate these differences are unclear. The reviewer has already brought up possibilities in the next comment regarding Tyr codons in 5’ or 3’ ends or consecutive Tyr codons. These are all interesting hypotheses, and others in the field have devoted entire publications to understanding how and why codon interactions and localizations impact translation (see Gamble et al., Cell 2016, Kunec and Osterrieder, Cell Reports 2016, Gobet et al., PNAS 2020). While these further analyses would be interesting, our current experimental data would be insufficient to properly address these questions. We have focused on a specific tRNA, its fragment, and demonstrated direct effects of the tRNA on the codon-dependent translation of a specific growth-regulating target gene and the tRNA fragment on the modulation of the activity of the RNA binding protein it binds to with respect to its regulon. 
We believe that these findings individually reveal causal roles for this tRNA and tRF in downstream gene regulation and collectively reveal a previously unappreciated post-transcriptional response. We hope the reviewer agrees with us regarding the already deep extent of the studies and that further such analyses beyond this tRNA are outside the scope and focus of this current study.

      Comment 4: The Authors should provide the specific parameters used to calculate the median abundance of Tyr codons in a protein and the list of proteins containing higher than median abundance of Tyr codon content. Moreover, the complete list of 102 candidate genes should also be provided. This will allow one to determine what percentage of these Tyr-enriched proteins exhibited a decrease in levels. Moreover, is there anything special about these Tyr codon-enriched transcripts where they are affected at the level of translation but not the other Tyr-codon enriched transcripts? For example, are these transcripts enriched at the 5' or 3' ends for Tyr codons? Do these transcripts exhibit multiple consecutive Tyr codons? This deeper analysis would enrich the findings in this manuscript.

      Response: For the proteins identified in the mass spectrometry and overlap listed in Fig 4A, Tyr codon abundance was calculated by dividing the number of Tyr amino acids present by the total number of amino acids for each protein. For genes with different isoforms possible, the principal isoform, using ENSEMBL, was used for calculations. We are also happy to provide the entire list of proteins. Additionally, please see above response to comment 3. We wish to emphasize that the goal of identification of these proteins was to identify downstream targets of this response for functional studies, which we have done. We have identified downstream genes that become modulated by this response and that regulate cell growth, consistent with the phenotype of the tRNA. We then demonstrated a direct causal tRNA-dependent codon-based response with a specific target gene using mutagenesis.
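
      The Tyr content calculation described above (number of Tyr residues divided by total residues, compared against the median) can be sketched as follows. This is an illustrative example only, not the script used for Fig 4A; the function names and the three toy sequences are hypothetical, and one principal-isoform protein sequence per gene is assumed as input.

```python
from statistics import median


def tyr_fraction(protein_seq):
    """Number of Tyr (Y) residues divided by total residues."""
    seq = protein_seq.strip().upper()
    if not seq:
        raise ValueError("empty protein sequence")
    return seq.count("Y") / len(seq)


def above_median_tyr(proteins):
    """Return per-protein Tyr fractions, the median, and proteins above it."""
    fractions = {name: tyr_fraction(seq) for name, seq in proteins.items()}
    cutoff = median(fractions.values())
    enriched = sorted(name for name, frac in fractions.items() if frac > cutoff)
    return fractions, cutoff, enriched


# Hypothetical toy sequences standing in for principal isoforms.
toy_proteins = {
    "protA": "MYYAYK",  # 3 of 6 residues are Tyr
    "protB": "MAAAYK",  # 1 of 6 residues is Tyr
    "protC": "MAAAAK",  # no Tyr
}

fractions, cutoff, enriched = above_median_tyr(toy_proteins)
print(fractions, cutoff, enriched)
```

      Proteins whose fraction is strictly above the median are treated as Tyr-enriched in this sketch; whether ties at the median count as enriched is a choice the sketch makes, not something stated in the response.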

      While we agree that the additional analysis the reviewer is requesting to determine what constitutes heightened translational sensitivity to this response is interesting, we believe this is a challenging question for future studies. It is possible that enrichment at 5’ or 3’ or concentration of tyrosine codons could cause increased sensitivity. Ideally, one would have information on a larger set of proteins so that such challenging questions could be better statistically bolstered. Ultimately, the requested experiments that go beyond our current work would require further analyses and experiments to allow firm conclusions to be drawn. As the other reviewers state and this reviewer agrees, we have uncovered the initial discovery regarding this tRNA fragmentation response and provided mechanistic characterization. Future studies, which are beyond the scope of the current work will undoubtedly further characterize features of this response.

      Comment 5: The ribosome profiling results are condensed into two panels of Figure 5E and 5F. We recommend the ribosome profiling experiment be expanded into its own figure with more extensive analysis and comparison beyond just looking at tRNA-Tyr. This could reveal insight into other codons that are impacted coordinately with Tyr codons and perhaps strengthen their conclusion. As an example of a more thorough analysis of ribosome profiling and proteomics, we point the Authors to this recent paper: Lyu et al. 2020 PLoS Genetics, https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1008836

      Response: We thank the reviewer for their suggestion. We have expanded the analysis to look at codon usage scatterplots across all codons for shTyr and shControl replicates (Fig S5D). The 5 most changed codons are labeled with UAC, a codon for the tyrosine amino acid, being the most affected (red arrow). Consistent with our model, a tyrosine codon, when at the ribosome A-site, is most affected with depletion of the corresponding tRNA. The text has also been edited to reflect our new analysis providing further evidence that ribosomal stalling might occur with depletion of a given tRNA. The gray outline around the regression line represents the 95% confidence interval.

      Fig S5D

      Comment 6: Moreover, one would expect that the mRNAs encoding USP3, EPCAM and SCD would exhibit increased ribosome occupancy. Thus, the authors should at least provide relative ribosome occupancy information on these transcripts to provide evidence that the decrease in protein levels is indeed linked to ribosome pausing or stalling.

      Response: We would like to emphasize that resolution of ribosomal profiling data at the codon level for specific genes requires a high number of reads and replicates to draw accurate conclusions. There is an inherent level of stochasticity when mapping RPFs to specific genes and as a result, our analysis revolved around Tyr-enriched vs Tyr-low populations as this analysis was appropriate for our sequencing depth and number of replicates. To be able to conclusively make claims regarding ribosome pausing or stalling for specific genes, we would likely need further experimentation than can be currently done. However, we are currently conducting the requested bioinformatic analysis and have promising preliminary transcript-level data supporting our model.

      Comment 7: The results with hnRNPA1 and SSB/La are extremely preliminary and simply show binding of tRNA fragments but no biological relevance. We realize that the Authors attempted to see if Tyr-tRNA fragments impacted RNA Pol III RNA but found no effect. A potential experiment would be to perform HITS-CLIP on H2O2-treated cells to see if stress-induced tRNA fragments bind to SSB/La or hnRNPA1. In this case, at least the Authors would link the oxidative stress results found in Figure 1 and 2 with La/SSB and hnRNPA1.

      Response: We agree with the reviewer that a tRF function was not established in the manuscript. As a result, we have recently completed experiments looking at mRNA stability of the hnRNPA1 regulon in the context of overexpressing the tRF as well as using LNA to inhibit this Tyr-tRF (Fig 7E-F). Our data shows, in an hnRNPA1-dependent manner, that its regulon can be functionally regulated by Tyr-tRF. With tRF overexpression and RNAi-mediated depletion of hnRNPA1, a right shift in transcript stability is seen. Importantly, when we do the converse experiment with tRF inhibition in the same RNAi-mediated reduction of hnRNPA1, we see a left shift. These complementary experiments provide data that the Tyr-tRF has a functional role when bound to hnRNPA1 by modulating the regulon of hnRNPA1 and expand the scope of this manuscript and extend the pathway defined downstream of this tRNA fragmentation event.

      Fig 7E-F

      Comment 8: The manuscript concludes that "Tyrosyl tRNA-GUA fragments are generated in a DIS3L2-dependent manner" based upon data in Supplemental Figure S7. However, there is still a substantial amount of tyrosine tRNA fragments in both worms and human cells depleted of DIS3L2. Thus, DIS3L2 could play a role in the formation of Tyrosine tRNA fragments but it is too strong a claim to say that tRNA fragments are "dependent" upon DIS3L2. We suggest that the Authors soften their conclusions.

      Response: While there are certainly tRFs still apparent with DIS3L2 depletion (Fig S7F-I), we note significant impairment of tRF induction with DIS3L2 knockdown/knockout with multiple different methods in C. elegans and human cells. This data supports our conclusion that tRF generation is dependent on DIS3L2 as this ribonuclease is necessary to elicit the full Tyr-tRF response. We do not make claims that Tyr-tRFs are solely or completely dependent on DIS3L2. There must be other RNases involved given the data highlighted by the reviewer. To this point, we have added clarifying text that DIS3L2 depletion does not completely eliminate the tRF induction.

      Comment 9: Moreover, what is the level of DIS3L2 depletion in the worm and human cell lines? The Authors should provide the immunoblot of DIS3L2 that was described in the Materials and Methods.

      Response: An immunoblot of DIS3L2 depletion in human cells has now been added as a supplementary figure (Fig S7I). Depletion in C. elegans was confirmed through sequencing of a mutation, as is standard in the field. The wild-type PCR product is 1 nt longer (859 bp) than the mutant product (858 bp), with a CTC to TAG nonsynonymous mutation preceding a single nucleotide deletion.

      Wild-type disl-2: GTTGAAGCCGCAGGGC[CTC]ACTCAGACAGCTACAGG

      disl-2 (syb1033): GTTGAAGCCGCAGGGC[TAG]-CTCAGACAGCTACAGG
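
      The two aligned sequences above can be compared position by position with a short sketch. The sequences are copied from the alignment shown; the function names are hypothetical, and the reading frame of the bracketed triplet is not stated here, so treating TAG as an in-frame stop codon is an assumption.

```python
def strip_brackets(aligned):
    """Remove the [ ] markers that highlight the mutated triplet."""
    return aligned.replace("[", "").replace("]", "")


def compare_aligned(wild_type, mutant):
    """List substitutions and deletions between two equal-length aligned strings."""
    if len(wild_type) != len(mutant):
        raise ValueError("aligned sequences must have the same length")
    substitutions, deletions = [], []
    for pos, (wt_base, mut_base) in enumerate(zip(wild_type, mutant)):
        if mut_base == "-":
            deletions.append((pos, wt_base))  # base lost in the mutant
        elif wt_base != mut_base:
            substitutions.append((pos, wt_base, mut_base))
    return substitutions, deletions


wt = strip_brackets("GTTGAAGCCGCAGGGC[CTC]ACTCAGACAGCTACAGG")
mut = strip_brackets("GTTGAAGCCGCAGGGC[TAG]-CTCAGACAGCTACAGG")

subs, dels = compare_aligned(wt, mut)
length_difference = len(wt) - len(mut.replace("-", ""))
print(subs, dels, length_difference)
```

      The comparison reports three substituted positions (CTC to TAG) followed immediately by one deleted base, and a net length difference of one nucleotide, consistent with the 859 bp versus 858 bp PCR products.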

      Fig S7I

      Comment 10: The key conclusions of "a tRNA-regulated growth suppressive oxidative stress response pathway" and an "underlying adaptive codon-based gene regulatory logic inherent to the genetic code" are overstated. This is because of the major caveat that knockdown of tyrosine-tRNA or tyrosyl-tRNA synthetase are likely to trigger numerous indirect effects. While the authors validate that three proteins are expressed at lower levels under all three conditions (H2O2, tRNA-Tyr and YARS), they might overlap in some manner but not necessarily define a coordinated response. Thus, a glaring gap in this paper is a clear, mechanistic link between H2O2-induced changes in translation versus the changes in expression when either tRNA-Tyr or YARS is depleted. Thus, it is too preliminary to conclude that tRNA depletion is part of a "pathway" and "regulatory logic" when it could all be pleiotropic effects. At the very least, the authors should discuss the possibility of indirect effects to provide a more nuanced discussion of the results obtained using two different cell systems and oxidative stress.

      Response: We thank the reviewer for the feedback. While we agree that indirect effects may exist, we do not make any claims that our pathway is the only one required to have translation effects. The text for Fig 4A already acknowledges the pleiotropic effects of tRNA depletion. Our data shows that H2O2 stress leads to a depletion of Tyr tRNA-GUA and that depletion of this tRNA through multiple complementary methods has a codon-dependent effect on protein expression. We hope the reviewer agrees that the reduction of a specific target gene in a tyrosine codon-dependent manner (demonstrated by mutagenesis) and the binding of the tRF directly to an RBP and the modulation of the regulon of this RBP by this tRF (demonstrated by gain- and loss-of-function studies) demonstrates a direct role of this response on specific downstream target genes rather than pleiotropy. This is in keeping with the cross-comments of reviewer 1, where Fig 5D shows a direct Tyr codon link between H2O2 and downstream effects. As a result, we feel that our conclusions of a pathway (not the only pathway) are valid. However, the conclusion of a “regulatory logic” might not be interpreted in the same way by all readers and we have thus changed the text to reflect a more nuanced position.

      **Minor comments:** Comment 11: Tyrosyl-tRNAs refers to the aminoacylated form of tRNA. We recommend that all instances of tyrosyl-tRNA be changed to tyrosine tRNA or tRNA-Tyr which is more generic and provides no indication as to the aminoacylation status of a tRNA.

      Response: We thank the reviewer for their correction. We have changed all instances of “tyrosyl” to “tyrosine” in the text.

      Comment 12: In Figure 5C, the promoter is drawn as T7, which is a bacteriophage promoter. While the plasmid used in this manuscript (psiCHECK2) does contain a T7 promoter, mammalian gene expression is driven from the SV40 promoter. Thus, the relevant label in Figure 5C should be "SV40 promoter". Moreover, additional details should be provided on how the construct was made (such as sequence information etc.).

      Response: We thank the reviewer for their correction. We have changed the promoter text in the figure. In the methods for the construct, we have included which USP3 was used and would be happy to include further information if requested.

      Comment 13: Please provide original blots for each of the replicates in: Figure 4C, n=4; Figure 4A, n=9; Figure 4D, n=3; Figure 5D, n=3.

      Response: There appears to be an unintentional mislabeling of the requested blots by the reviewer. The original blots for Fig 4C, Fig 5A, Fig 5D, and Fig 6D have been made available in a separate file for reviewers.

      Reviewer #3 (Significance (Required)): This manuscript provides evidence that specific tRNAs are depleted upon oxidative stress as part of a conserved stress-response pathway in humans (and worms) to regulate translation in a codon-based manner. Unfortunately, the manuscript attempts to tie together results from different conditions and systems without providing any definitive links that suggest a "pathway" involved in the oxidative stress response. The findings in this paper provide a useful starting point but fall short of being a major advance due to the lack of a clear mechanism. However, there are intriguing results in this manuscript based upon the cell lines depleted of tRNA-Tyr or tyrosine synthetase that could interest researchers in the field of tRNA biology.

      Response: We thank the reviewer for the positive comments regarding our demonstration of a conserved stress response, acknowledging the intriguing nature of our findings that will be a starting point for future studies and that our work will be of interest to researchers in the field of tRNA biology. We hope that the very positive comments of reviewers 1 and 2, the cross-comments of reviewer 1 in response to reviewer 3’s comments regarding the specificity of this response, and our inclusion for reviewer 3 of additional data on the function of the tRF in regulating the activity of the hnRNPA1 RNA binding protein defining a post-transcriptional pathway and additional corroborating requested codon-level computational analyses provide compelling support that our findings indeed represent a major advance for the field.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The major findings in this manuscript are: 1.) Oxidative stress in human cells causes a decrease in tyrosine tRNA levels and accumulation of tyrosine tRNA fragments; 2.) The depletion of tyrosyl-tRNA synthetase or tyrosine tRNAs in human cells results in altered translation of certain genes and reduced cell growth and 3.) hnRNPA1 and SSB/La can bind tyrosine tRNA fragments. There is also preliminary evidence that the DIS3L2 endonuclease contributes to the appearance of tyrosine tRNA fragments upon oxidative stress. Based upon these results, the Authors conclude that tyrosine tRNA depletion is part of a conserved stress-response pathway to regulate translation in a codon-based manner.

      Major comments:

      •There is a considerable amount of data in this paper and the experiments are performed in a generally rigorous manner. Sufficient details are provided for reproducing the findings and all results have been provided to appropriate databases (RNA-Seq and ribosome profiling).

      •The manuscript uses a probe against the 5' half of Tyrosine tRNA for Northern blotting. However, tRNA probes can be prone to cross-hybridization, especially with some tRNA isoacceptors being similar in sequence. Thus, the blots in Figure 2 and Supplemental Figures should be probed with an oligonucleotide against the 3' half of tRNA-Tyr. This will confirm the pre- and mature tRNA-Tyr bands detected with the 5' probe. Moreover, this will determine whether 3' tRNA-Tyr fragments accumulate.

      •The analysis of the proteomic and ribosome profiling experiments seems rather limited, or based upon what was presented in this manuscript. If additional analyses were performed, then they should be included as well, even if they yielded negative results. For example, the manuscript identifies 102 proteins that decrease after tRNA-Tyr depletion and YARS-depletion with a certain threshold of Tyr codon content. We realize the Authors were trying to find potential genes that are modulated under all three conditions. However, this does not provide information whether there is a relationship between a certain codon such as Tyr and protein abundance if only binning into two categories representing below and above a certain codon content. The Authors should plot the abundance change of each detected protein versus each codon and determine the correlation coefficient. This analysis is important for substantiating the conclusion of a codon-based system of specifically modulating transcripts enriched for certain codons. Otherwise, how could changes in tRNA-Tyr levels modulate codon-dependent gene expression if two different transcripts with the same Tyr codon content exhibit differences in translation? Moreover, this analysis should be performed with all the other codons as well.

      •The Authors should provide the specific parameters used to calculate the median abundance of Tyr codons in a protein and the list of proteins containing higher than median abundance of Tyr codon content. Moreover, the complete list of 102 candidate genes should also be provided. This will allow one to determine what percentage of these Tyr-enriched proteins exhibited a decrease in levels. Moreover, is there anything special about these Tyr codon-enriched transcripts where they are affected at the level of translation but not the other Tyr-codon enriched transcripts? For example, are these transcripts enriched at the 5' or 3' ends for Tyr codons? Do these transcripts exhibit multiple consecutive Tyr codons? This deeper analysis would enrich the findings in this manuscript.

      •The ribosome profiling results are condensed into two panels of Figure 5E and 5F. We recommend the ribosome profiling experiment be expanded into its own figure with more extensive analysis and comparison beyond just looking at tRNA-Tyr. This could reveal insight into other codons that are impacted coordinately with Tyr codons and perhaps strengthen their conclusion. As an example of a more thorough analysis of ribosome profiling and proteomics, we point the Authors to this recent paper: Lyu et al. 2020 PLoS Genetics, https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1008836

      •Moreover, one would expect that the mRNAs encoding USP3, EPCAM and SCD would exhibit increased ribosome occupancy. Thus, the authors should at least provide relative ribosome occupancy information on these transcripts to provide evidence that the decrease in protein levels is indeed linked to ribosome pausing or stalling.

      •The results with hnRNPA1 and SSB/La are extremely preliminary and simply show binding of tRNA fragments but no biological relevance. We realize that the Authors attempted to see if Tyr-tRNA fragments impacted RNA Pol III RNA but found no effect. A potential experiment would be to perform HITS-CLIP on H2O2-treated cells to see if stress-induced tRNA fragments bind to SSB/La or hnRNPA1. In this case, at least the Authors would link the oxidative stress results found in Figure 1 and 2 with La/SSB and hnRNPA1.

      •The manuscript concludes that "Tyrosyl tRNA-GUA fragments are generated in a DIS3L2-dependent manner" based upon data in Supplemental Figure S7. However, there is still a substantial amount of tyrosine tRNA fragments in both worms and human cells depleted of DIS3L2. Thus, DIS3L2 could play a role in the formation of Tyrosine tRNA fragments but it is too strong a claim to say that tRNA fragments are "dependent" upon DIS3L2. We suggest that the Authors soften their conclusions.

      •Moreover, what is the level of DIS3L2 depletion in the worm and human cell lines? The Authors should provide the immunoblot of DIS3L2 that was described in the Materials and Methods.

      •The key conclusions of "a tRNA-regulated growth suppressive oxidative stress response pathway" and an "underlying adaptive codon-based gene regulatory logic inherent to the genetic code" are overstated. This is because of the major caveat that knockdown of tyrosine-tRNA or tyrosyl-tRNA synthetase is likely to trigger numerous indirect effects. While the authors validate that three proteins are expressed at lower levels under all three conditions (H2O2, tRNA-Tyr and YARS), they might overlap in some manner but not necessarily define a coordinated response. Thus, a glaring gap in this paper is a clear, mechanistic link between H2O2-induced changes in translation versus the changes in expression when either tRNA-Tyr or YARS is depleted. Thus, it is too preliminary to conclude that tRNA depletion is part of a "pathway" and "regulatory logic" when it could all be pleiotropic effects. At the very least, the authors should discuss the possibility of indirect effects to provide a more nuanced discussion of the results obtained using two different cell systems and oxidative stress.
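
      The per-codon correlation analysis requested in the major comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the toy coding sequences and the log2 fold-change values are hypothetical and do not come from the manuscript, and Pearson correlation is used simply as one possible choice of coefficient.

```python
from math import sqrt

# Tyrosine codons written as DNA (UAC/UAU in the mRNA).
TYR_CODONS = {"TAC", "TAT"}


def codon_fraction(cds, codons):
    """Fraction of in-frame codons in `cds` that belong to `codons`."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    triplets = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not triplets:
        return 0.0
    return sum(t in codons for t in triplets) / len(triplets)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def codon_abundance_correlation(cds_by_gene, log2fc_by_gene, codons):
    """Correlate codon content with protein abundance change across genes."""
    genes = sorted(set(cds_by_gene) & set(log2fc_by_gene))
    xs = [codon_fraction(cds_by_gene[g], codons) for g in genes]
    ys = [log2fc_by_gene[g] for g in genes]
    return pearson(xs, ys)


# Hypothetical toy input: three short coding sequences and made-up
# log2 fold changes in protein abundance after tRNA-Tyr depletion.
toy_cds = {
    "geneA": "ATGTACTACTAT",
    "geneB": "ATGTACGCTGCT",
    "geneC": "ATGGCTGCTGCT",
}
toy_log2fc = {"geneA": -1.5, "geneB": -0.5, "geneC": 0.0}

r = codon_abundance_correlation(toy_cds, toy_log2fc, TYR_CODONS)
```

      Repeating the call with each of the other codon sets in place of `TYR_CODONS` would give the all-codon comparison the comment asks for.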

      Minor comments:

      •Tyrosyl-tRNAs refers to the aminoacylated form of tRNA. We recommend that all instances of tyrosyl-tRNA be changed to tyrosine tRNA or tRNA-Tyr which is more generic and provides no indication as to the aminoacylation status of a tRNA.

      •In Figure 5C, the promoter is drawn as T7, which is a bacteriophage promoter. While the plasmid used in this manuscript (psiCHECK2) does contain a T7 promoter, mammalian gene expression is driven from the SV40 promoter. Thus, the relevant label in Figure 5C should be "SV40 promoter". Moreover, additional details should be provided on how the construct was made (such as sequence information etc.).

      •Please provide original blots for each of the replicates in:

      Figure 4C, n=4

      Figure 4A, n=9

      Figure 4D, n=3

      Figure 5D, n=3

      Significance

      This manuscript provides evidence that specific tRNAs are depleted upon oxidative stress as part of a conserved stress-response pathway in humans (and worms) to regulate translation in a codon-based manner. Unfortunately, the manuscript attempts to tie together results from different conditions and systems without providing any definitive links that suggest a "pathway" involved in the oxidative stress response. The findings in this paper provide a useful starting point but fall short of being a major advance due to the lack of a clear mechanism. However, there are intriguing results in this manuscript based upon the cell lines depleted of tRNA-Tyr or tyrosine synthetase that could interest researchers in the field of tRNA biology.

      This review is written from the perspective of a researcher with expertise in RNA processing, RNA biology and translation regulation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This very interesting study from Sohail Tavazoie's lab describes the consequences of oxidative stress on the tRNA pool in human epithelial cell lines. As previously described, the authors observed that tRNA fragments were generated upon exposure of cells to ROS. In addition, the authors made the novel observation that specific mature tRNAs were also depleted under these conditions. In particular, the authors focused on tyrosyl tRNA-GUA, which was decreased ~50% after 24 hours of ROS exposure, an effect attributable to a decrease in the pre-tRNA pool. Depletion of tyrosyl tRNA resulted in reduced translation of specific mRNAs that are enriched in tyr codons and likely contributed to the anti-proliferative effects of ROS exposure. In addition, the authors demonstrated that the tRFs produced from tyr tRNA-GUA can interact with specific RNA binding proteins (SSB and hnRNPA1).

      The major contribution of this paper is the novel finding that stress-induced tRNA fragmentation can result in a measurable reduction of specific mature tRNAs, leading to a selective reduction in translation of mRNAs that are enriched for the corresponding codons. Previously, studies of tRNA fragmentation largely focused on the functions of the tRFs themselves and it was generally believed that the mature tRNA pool was not impacted sufficiently to reduce translation. The findings reported here therefore add a new dimension to our understanding of the cellular consequences of stress-induced tRNA cleavage.

      Overall, the data are of high quality, the experiments are convincing, and the conclusions are well supported. I have the following suggestions that would further strengthen the study and bolster the conclusions.

      1. The authors have not formally demonstrated that the reduction in pre-tRNA in H2O2-treated cells is a consequence of pre-tRNA cleavage. It is possible that reduced transcription contributes to this effect. Pulse-chase experiments with nucleotides such as EU would provide a tractable approach to demonstrate that a labelled pool of pre-tRNA is rapidly depleted upon H2O2 treatment, which would further support their model. Since the response occurs rapidly (within 1 hour), it would be feasible to monitor the rate of pre-tRNA depletion during this time period in control vs. H2O2-treated cells.

      2. To what extent is the growth arrest that results from H2O2 treatment attributable to tyr tRNA-GUA depletion (Fig. 3A)? Since the reduction in tRNA levels is only partial (~50%), it should be feasible to restore tRNA levels by overexpression (strategy used in Fig. 3E, S3B) and determine whether this measurably rescues growth in H2O2-treated cells.

      3. Knockdown of YARS/tyr tRNA-GUA resulted in reduced expression of EPCAM, SCD, and USP3 at both the protein and mRNA levels (Fig. 4C-D, S4C). In contrast, H2O2-exposure reduced the abundance of these proteins without affecting mRNA levels (Fig. 5A-B, S5A). The authors should comment on this apparent discrepancy. Perhaps translational stalling induces No-Go decay, but it is unclear why this response would not also be triggered by ROS.

      4. In addition to the analyses of ribosome profiling in Fig. 5E-F, it might also be helpful to show a metagene analysis of ribosome occupancy centered upon UAC/UAU codons (for an example, see Figure 2 of Schuller et al., Mol Cell, 2017). This has previously been used as an effective way to visualize ribosome stalling at specific codons. Additionally, do the authors see a global correlation between tyrosine codon density and reduced translational efficiency in tRNA knockdown cells?

      5. MINOR: On pg. 4, the authors state that tRF-tyrGUA is the most highly induced tRF, but Fig. S1B appears to show stronger induction of tRF-LeuTAA.
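
      As a rough illustration of the metagene analysis suggested in point 4, the sketch below averages normalized ribosome footprint density in a window around every UAC/UAU codon. It assumes footprints have already been assigned to codon positions; the function name, the input layout (one count per codon) and the toy numbers are hypothetical, and this is not the procedure used by Schuller et al. or by the authors.

```python
TYR_CODONS_RNA = {"UAC", "UAU"}


def metagene_profile(transcripts, target_codons, window=2):
    """Mean normalized occupancy at offsets -window..+window around targets.

    `transcripts` is an iterable of (codons, counts) pairs, where `counts`
    holds one ribosome footprint count per codon of that transcript.
    Counts are divided by the transcript mean so that highly and lowly
    expressed transcripts contribute equally.
    """
    sums = [0.0] * (2 * window + 1)
    n_sites = 0
    for codons, counts in transcripts:
        if not counts:
            continue
        mean_count = sum(counts) / len(counts)
        if mean_count == 0:
            continue  # no coverage, nothing to normalize
        for pos, codon in enumerate(codons):
            if codon not in target_codons:
                continue
            # Skip sites whose window would run off the transcript ends.
            if pos - window < 0 or pos + window >= len(codons):
                continue
            for k in range(-window, window + 1):
                sums[k + window] += counts[pos + k] / mean_count
            n_sites += 1
    return [s / n_sites for s in sums] if n_sites else []


# Hypothetical toy transcript with elevated density on its single Tyr codon.
toy = [(["AUG", "GCU", "UAC", "GCU", "GCU"], [1, 1, 4, 1, 1])]
profile = metagene_profile(toy, TYR_CODONS_RNA, window=1)
```

      A peak at the central offset in tRNA-depleted or H2O2-treated cells, relative to control, would be the signature of stalling at tyrosine codons.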

      Significance

      The major advance provided by this work is the demonstration that stress-induced tRNA cleavage can reduce the abundance of the mature tRNA pool sufficiently to impact translation. Moreover, the effect on mature tRNAs is selective, resulting in the reduced translation of a specific set of mRNAs under these conditions. These findings reveal previously unknown consequences of oxidative stress on gene expression and will be of interest to scientists working on cellular stress responses and post-transcriptional regulation.

      REFEREES CROSS-COMMENTING

      Regarding the concern that the disappearance of the pre-tRNA could be a transcriptional response (reviewer 2), I think that the appearance of tRFs makes this scenario unlikely. If pre-tRNA levels decreased due to transcriptional repression, wouldn't one expect that both tRNA and the tRF levels diminish concomitantly?

      Here is what I was thinking: The generation of tRFs does not generally result in reduction in levels of the mature tRNAs. So you can imagine a scenario where oxidative stress causes tRF generation from the mature tyr tRNA (which does not impact its steady-state levels), as is the case for other tRNAs. At the same time, decreased transcription would reduce the pre-tRNA pool, leading to a delayed reduction in mature tRNA, as observed.

      However, looking back at the data, I see that after only 5 min of H2O2 treatment, the authors observed reduced pre-tRNA and increased tRFs (Fig. 2A). This seems very fast for a transcriptional response, which would presumably require some kind of signal transduction. In addition, when you consider the amount of tRFs produced in Fig. S2C, it is hard to imagine that this would not impact the mature tRNA pool if they were derived from there. So I agree that the transcriptional scenario seems unlikely.

      Nevertheless, I think that looking at pre-tRNA degradation directly with the pulse-chase strategy would strengthen their story, so I would like to give the authors this suggestion. However, I am fine with listing this as an optional experiment which would enhance the paper but should not be essential for publication.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Huh et al. reports that oxidative stress causes fragmentation of a specific tyrosine pre-tRNA, leading to two parallel outcomes. First, the fragmentation depletes the mature tRNA, causing translational repression of genes that are disproportionally rich in tyrosine codon. These genes are enriched for those involved in electron transport chain, cell cycle and growth. Second, the fragmentation generates tRNA fragments (tRFs) that bind to two known RNA binding proteins. Finally, the authors identify a nuclease that is needed for efficient formation of tyrosine tRFs.

      The authors should include a short diagram indicating the various known steps of pre-tRNA fragmentation (perhaps as a supplement) for general readers.

      I find the enrichment for mitochondrial electron transport chain (ETC) curious. The ETC includes several oxidoreductases, which may be rich in tyrosine as it is a common amino acid used in electron transfer. The depletion of the tyrosine tRNA from among many tRNAs under oxidative stress may not be incidental but related to an attempt by the cell to decrease oxygen consumption to avoid further oxidative damage. The authors could further mine their data to corroborate this hypothesis. For example, are the ETC genes among the targets of the RNA binding proteins targeted by tyrosine tRFs? This could potentially connect the effects of mature tRNA depletion and tRFs.

      In figure 4A, the authors should provide the tyrosine codon content of the overlap genes and show how much it differs from a randomly selected sample.
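
      One simple way to make such a comparison is an empirical resampling test: draw many random gene sets of the same size from the background and ask how often their mean tyrosine codon content reaches that of the overlap genes. The sketch below is a minimal example under that assumption; the function name and the toy values are hypothetical and are not taken from the manuscript.

```python
import random
from statistics import mean


def empirical_enrichment(content_by_gene, gene_set, n_draws=2000, seed=0):
    """Compare mean codon content of `gene_set` with random gene sets.

    Returns the observed mean and an empirical one-sided p-value: the
    share of random draws (same size, without replacement) whose mean
    content is at least as high as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    background = list(content_by_gene.values())
    observed = mean(content_by_gene[g] for g in gene_set)
    k = len(gene_set)
    hits = sum(
        mean(rng.sample(background, k)) >= observed for _ in range(n_draws)
    )
    # Add-one correction keeps the estimate away from an exact zero.
    return observed, (hits + 1) / (n_draws + 1)


# Hypothetical background of 100 genes with made-up Tyr codon fractions,
# and an "overlap" set made of the three most Tyr-rich genes.
toy_content = {f"g{i}": i / 100 for i in range(100)}
toy_overlap = ["g97", "g98", "g99"]
observed_mean, p_value = empirical_enrichment(toy_content, toy_overlap)
```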

      Fig.6F, lower panel: the model should show pre-tRNA, as opposed to mature tRNA, because it is the former that is fragmented.

      Significance

      This study is comprehensive and novel, and includes several orthogonal and complementary approaches to provide convincing evidence for the conclusions. The main discovery is significant because it presents an important advance in post-transcriptional control of gene expression. The process of tRF formation was previously thought not to affect the levels of mature tRNA. This study changes that understanding by describing for the first time the depletion of a specific mature tRNA as its precursor form is fragmented to generate tRFs. Finally, the authors identify DIS3L2 as a nuclease involved in fragmentation. This is also an important finding as the only other suspected nuclease, albeit with contradictory evidence, is angiogenin. Collectively, the findings of this study would be of interest to a broad group of scientists. I only have a few minor comments and suggestions (see above).

      REFEREES CROSS-COMMENTING

      I have the following comments on other reviewers' critiques.

      Regarding the concern that the disappearance of the pre-tRNA could be a transcriptional response (reviewer 2), I think that the appearance of tRFs makes this scenario unlikely. If pre-tRNA levels decreased due to transcriptional repression, wouldn't one expect that both tRNA and the tRF levels diminish concomitantly?

      Reviewer 3 raises the issue of cross hybridization in Northern blots. The authors indicate that they "could not detect the other tyrosyl tRNA (tRNA Tyr AUA) in MCF10A cells by northern blot..." (page 6). Also, they gel extracted tRFs and sequenced them (figure S6B), directly identifying the fragments. I think these findings mitigate the concern of cross hybridization and clearly identify the nature of tRFs.

      Finally, I think that the codon-dependent reporter experiment (figure 5D) addresses many issues surrounding codon dependent vs indirect effects. In that experiment, the authors mutate 5 tyrosine codons of a reporter gene and demonstrate that the encoded protein is less susceptible to repression in response to oxidative stress.

  3. Aug 2020
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to the Referees

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript Yan et al describe a method to perform imaging based pooled CRISPR screens based on photoactivation followed by selection and sorting of the cells with the desired phenotypes.

      They establish a system in mammalian RPE-1 cells where they integrate a photo-activatable mCherry, identify the cells of interest under the microscope based on a phenotype, automatically activate the mCherry fluorescence in these cells and then sort the desired populations by FACS. They demonstrate the reliability of their enrichment method and finally use this approach to look for factors that regulate nuclear size by a targeted pooled CRISPR screen.

      **Major points:**

      1. This year Hasle et al. described a very similar approach that they name Visual Cell Sorting. In this case, they use a photoconvertible fluorescent protein (green-to-red conversion) to select cells with a certain visual cellular phenotype and enrich those by FACS. The Hasle et al. 2020 MSB paper is only mentioned together with the other methods in the introduction in one sentence (ref #19 in this manuscript):

      " Recently, several in situ sequencing15,16 and cell isolation methods17-20 were developed which allow microscopes to be used for screening. However, these methods contain non-high throughput steps that limit their scalability."

      I think the current citation of the Hasle et al. paper is not really fair. The idea and the execution of the two approaches are almost exactly the same. Here, the authors concentrate on a CRISPR based application, but obviously the applications of the method are not limited to that. The authors should discuss how these similar ideas can be used in several different applications.

      We agree with the reviewer that we need to describe more about the Hasle et al. paper (now ref #20 in the revised manuscript) and expand our description of other applications that could be performed with the method. For this purpose, we have made the following changes:

      We have modified the relevant paragraph in the Introduction.

      p.3 the second paragraph

      Recently, an imaging-based method named “visual cell sorting” was described that uses the photo-convertible fluorescent protein Dendra2 to enrich phenotypes optically, enabling pooled genetic screens and transcription profiling (Hasle et al., 2020). Here, we developed an analogous approach to execute an imaging-based pooled CRISPR screen using optical enrichment by automated photo-activation of the photo-activatable fluorescent protein, PA-mCherry.

      We have also added the following paragraph in the Discussion.

      p.14 line 1

      In our study, optical enrichment was utilized for pooled CRISPR screens on phenotypes identifiable through microscopy. However, optical enrichment can be used for other purposes, as demonstrated previously (Hasle et al., 2020). In a recent study by Hasle et al. (2020), the process of separating cells by FACS after optical enrichment was termed “visual cell sorting”. This method was used to evaluate hundreds of nuclear localization sequence variants in a pooled format and to identify transcriptional regulatory pathways associated with paclitaxel resistance using single cell sequencing (Hasle et al., 2020), demonstrating the broad applicability and power of this approach beyond CRISPR screening.

      2. While I understand that the authors mean conversion from the dark state to fluorescent state when they describe their photo-activatable mCherry, I think the term "photo-conversion" can be confusing for the general reader since typically photo-conversion refers to a change in color. I would suggest sticking to the term photo-activation.

      We thank the reviewer for pointing this out and to avoid future confusion, we restricted the usage of photo-conversion to specifically indicate conversion of fluorescence from one color into another: e.g. when talking about the published visual cell sorting paper in which Dendra2 is used as a photo-convertible fluorescent protein. We use photo-activation in reference to the activation of PA-mCherry in our work.

      3. For validation of the hits coming from the nuclear size screen: Did the authors have any controls making sure that the right targets were down-regulated? This might be obvious for some of the targets (e.g. CPC proteins that are known to induce division errors display the nuclear fragmentation that the authors also observe) but especially for the ones that are less known or unknown to induce any nuclear size change, it will be important to demonstrate the specificity of the targets.

      For validating hits coming from the nuclear size screen, we have verified the successful transduction of corresponding sgRNA constructs by FACS analysis, but have not confirmed the knockdown. Before final journal publication, we propose to perform RT-qPCR on each of our 15 gene hits before and after knockdown to measure the percentage of knockdown.

      In addition, it is not clear from the figure legends and the material and methods if these phenotypes are verified by 3-4 gRNAs they use in the validation. Are the histograms representative of a single experiment with one gRNA or a combination of gRNAs in different experiments? Methods of replication of the data presented in Fig4 is unclear.

      We apologize for the confusion. These phenotypes were verified with pools of 3-4 sgRNAs and the histograms are representative of a single replicate infected with a mixed 3-4 sgRNA pool. We have modified the legend to Figure 5 (original Fig. 4) and the method section to explain this point.

      Minor points:

      1. Related to major point #3: I could not find much experimental info on how the hits from the screen were verified in materials and methods.

      The description of the experiment and information about the selected sgRNAs has been added in the Method section as follows:

      p.23

      Verification of hits from nuclear size screen

      For each hit in the nuclear size screen, the two sgRNAs with the highest phenotypic score in the screen and the two sgRNAs with the highest score predicted by the CRISPRi-v2 algorithm24 were selected and pooled to generate a mixed sgRNA pool of 3-4 sgRNAs (detailed information in Supplementary file 8). Cells (hTERT-RPE1 dCas9-KRAB-BFP PA-mCherry H2B-mGFP) were transduced with pooled sgRNAs targeting each gene and puromycin selected for 2 days to prepare for imaging. Cells were then seeded into 96-well glass bottom imaging dishes. Images were collected the next day and nuclear size was measured using the Auto-PhotoConverter µManager plugin. To focus on cells with successful transduction, BFP was co-expressed on the sgRNA construct and only cells with BFP intensity above a threshold value were included in nuclear size measurements. This BFP threshold was established by comparing the average BFP intensity of cells with and without sgRNA transduction (Fig.S3a).
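
      The BFP gating step described above can be illustrated with a short sketch. The text does not state how the threshold was derived from the two populations, so the midpoint between their mean intensities is used here purely as an assumption; the function names and all numbers are hypothetical and this is not the Auto-PhotoConverter implementation.

```python
from statistics import mean, median


def bfp_threshold(untransduced_bfp, transduced_bfp):
    """Threshold between two populations.

    Assumption: the midpoint of the two mean BFP intensities. The actual
    rule used in the study is not given in the text.
    """
    return (mean(untransduced_bfp) + mean(transduced_bfp)) / 2


def median_nuclear_size(cells, threshold):
    """Median nuclear size over cells whose BFP intensity exceeds `threshold`.

    `cells` is an iterable of (bfp_intensity, nuclear_size) pairs.
    Returns None when no cell passes the gate.
    """
    sizes = [size for bfp, size in cells if bfp > threshold]
    return median(sizes) if sizes else None


# Hypothetical intensities for control wells with and without sgRNA.
threshold = bfp_threshold([10, 20], [100, 120])

# Hypothetical per-cell measurements: (BFP intensity, nuclear area).
toy_cells = [(5, 100.0), (90, 150.0), (200, 170.0), (300, 190.0)]
size = median_nuclear_size(toy_cells, threshold)
```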

      We agree with this important point and have changed the figure legend of Fig. 5c (original Fig. 4c) to just describe the plot:

      c, The ratios between median level of nuclear size measured from microscopy and H2B-mGFP fluorescence or FSC signal measured from FACS after knockdown, were plotted separately. TACC3, confirmed to be a control gene, was used for comparison (Grey bar).

      The typo has been corrected.

      Reviewer #1 (Significance (Required)):

      I think the idea of performing pooled screens coupled to microscopy is exciting and this approach definitely has more potential than the Craft-ID approach that the authors also discuss in their manuscript. In addition, the approach that is described in this manuscript is convincing and, although the analysis part will require more work in the future (to adapt the software to recognise different types of phenotypic readouts) to make it accessible to the scientific community, the authors present sufficient evidence that the system can be robust. They also present some clever ideas, such as calculating enrichments with different photo-activation times (2 sec vs 100 ms) followed by separation of these populations by FACS.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Yan et al. present optical enrichment, a method for conducting pooled optical screens. Optical enrichment works by combining microscopy to mark cells of interest using the PA-mCherry photo-activatable fluorescent protein with FACS to recover them. The method is similar to other methods (Photostick, Visual Cell Sorting), and provides an alternative to in situ sequencing/FISH methods. The authors use optical enrichment to conduct a pooled optical CRISPRi screen for nuclear size. They identify and exhaustively validate hits, showing that optical enrichment works for its intended purpose. The development of a uManager protocol and discussion of the number of sgRNAs required for a genetic screen using optical enrichment were welcome. The authors' reported throughput of 1.5 million cells per eight hour experiment is impressive; and the demonstrated use of low cell number input for next generation sequencing appears promising. Overall, the manuscript is well written, the methods clear and the claims supported by the data presented.

      **General comments**

      -I found the analysis and scoring methods to be lacking, both in terms of the clarity of description and in terms of what was actually done. The authors might consider using established methods (eg https://www.biorxiv.org/content/10.1101/819649v1.full). In any case, they should revise the text to clarify what was done and address the other concerns raised below.

      -Relatedly, details regarding how to perform the experiments described are lacking. It is not clear from the text, figures, "Online Methods" section, and Supplementary Files whether all imaging is performed before activation, or whether each field of view is subject to an individual round of imaging followed by activation. It is also unclear whether cells in 96 well plates are sorted as 96 separate tubes or pooled into a single tube prior to sorting. Furthermore, at a minimum, the following details are requested for each optical enrichment "run". These details are critical considerations for those who seek to use optical enrichment in their own laboratories:

      Seeding density

      Time elapsed (in hours) between cell plating and optical enrichment

      The number of fields of view examined

      The median number of cells per field of view; the proportion of each plate's surface area that is imaged and photo-converted

      The total time taken (in hours) to perform imaging and photoconversion

      The gating protocol used for sorting by FACS (preferably including a figure with example gates for one or two experiments). The gating protocol is described for the genetic screen but not for the control experiments.

      We agree with the reviewer and apologize for the confusion that arose from our description. We also thank the reviewer for suggesting the use of established methods. However, MAUDE, an analysis method for sorting-based CRISPR screens with multiple expression bins, might not be suitable for our study since 1) the distribution of mCherry fluorescence intensity reflects photo-activation efficiency and not sgRNA effect, and 2) only one sorting bin is collected for each experimental condition. Our analysis is adapted from an existing method from the Weissman lab (https://github.com/mhorlbeck/ScreenProcessing).

      We agree with the reviewer regarding clarifying other points and rewrote the following part in the Method section:

      p. 20

      mIFP proof-of-principle screen, Nuclear size screen, FSC screen and H2B-mGFP screen

      For the mIFP proof-of-principle screen, mIFP positive cells (hTERT-RPE1 dCas9-KRAB-BFP PA-mCherry H2B-mGFP mIFP-NLS) and mIFP negative cells (hTERT-RPE1 dCas9-KRAB-BFP PA-mCherry H2B-mGFP) were stably transduced with the “mIFP sgRNA library” (CRISPRa library with 860 elements, see Supplementary file 5) and the “control sgRNA library” (CRISPRa library with 6100 elements, see Supplementary file 6) separately. For the nuclear size screen, FSC screen and H2B-mGFP screen, cells (hTERT-RPE1 dCas9-KRAB-BFP PA-mCherry H2B-mGFP) were stably transduced with the “nuclear size library” (CRISPRi library with 6190 elements, see Supplementary file 7). To guarantee that cells receive no more than one sgRNA per cell, BFP was expressed on the same sgRNA construct and cells were analyzed by FACS the day after transduction. The experiment only continued when 10-15% of the cells were BFP positive. These cells were further enriched by puromycin selection (a puromycin resistance gene was expressed from the sgRNA construct) for 3 days to prepare for imaging. For the FSC and H2B-mGFP screens, cells were then subjected to FACS sorting. Cells before FACS (unsorted sample for FSC and H2B-mGFP screens) and the top 10% of cells based on either FSC signal (high FSC sample) or GFP fluorescence signal (high GFP sample) were separately collected and prepared for high throughput sequencing. For the mIFP proof-of-principle screen and the nuclear size screen, cells were then seeded into 96-well glass bottom imaging dishes (Matriplate, Brooks) and imaged starting from the morning of the next day (around 15 hr after plating). A series of densities ranging from 0.5E4 cells/well to 2.5E4 cells/well, at intervals of 0.5E4 cells/well, were selected and seeded. The imaging dish with cells at around 70% confluency was selected to be screened on the imaging day. For the mIFP proof-of-principle screen, a single imaging plate was imaged for each replicate, while 4 imaging plates per replicate were imaged for the nuclear size screen. 
When executing multiple imaging runs, 2 consecutive runs could be imaged on the same day (day run and night run). 64 (8x8, day run) or 81 (9x9, night run) fields of view were selected for each imaging well and each field of view was subjected to an individual round of imaging directly followed by photo-activation. Around 200-250 cells were present in each given field of view and 60% to 80% surface area of each well was covered. Either mIFP positive cells or cells passing the nuclear size filter were identified and photo-activated automatically using the Auto-PhotoConverter µManager plugin. The total time to perform imaging and photo-activation of a single 96-well imaging dish with around 1.5 million cells was around 8 hr. The night run generally took longer, since more fields of view were included than in the day run. Cells were then harvested by trypsinization and pooled into a single tube for isolation by FACS. Sorting gates were pre-defined using samples with different photo-activation times (e.g. 0s, 200ms, 2s) and detailed gating strategies are described in Supplementary file 1. Sorted samples were used to prepare sequencing samples.
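As a rough consistency check on the numbers quoted above, the following sketch estimates how many cells are visited on one 96-well dish. It is illustrative only and not part of the Methods text: the function name is hypothetical, and it assumes that every well is imaged and that each field holds between 200 and 250 cells.

```python
def cells_per_plate(wells, fields_per_well, cells_per_field):
    """Estimate the number of cells imaged on one plate (illustrative helper)."""
    return wells * fields_per_well * cells_per_field


# Field counts and cells-per-field range are taken from the paragraph above;
# imaging all 96 wells is an assumption.
for label, fields in (("day run (8x8)", 64), ("night run (9x9)", 81)):
    low = cells_per_plate(96, fields, 200)
    high = cells_per_plate(96, fields, 250)
    print(f"{label}: {low:,} to {high:,} cells")
```

Under these assumptions the day run gives roughly 1.2 to 1.5 million cells per dish, in line with the "around 1.5 million cells" stated in the text.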

      -The authors use PA-mCherry. There are a variety of other photo-activatable fluorophores available, and it would be good for them to comment on why they chose PA-mCherry. Also, since the method is supposed to be used for generic pooled optical screens, it would be good for the authors to comment on what colors remain available for imaging cellular structures.

      To address these, we have added the following sentences:

      p. 4 line 16

      A photo-activatable fluorescent protein was chosen over a photo-convertible fluorescent protein to increase the number of channels available for imaging. PA-mCherry was chosen to leave the better performing green channel open for labeling of other cellular features. Moreover, non-activated PA-mCherry has low background fluorescence in the mCherry channel (Fig. S1b), and it can be activated to different intensities when photo-activated for various amounts of time.

      p. 14 line 10

      Phenotypes of interest should be identifiable under the microscope and generally require fluorescent labeling. Commonly used fluorescence microscopes use four channels for fluorescent imaging with little spectral overlap: blue, green, red and far red. In our study, the red channel was occupied by cell labeling with PA-mCherry and the blue channel was used to estimate sgRNA transduction efficiency. Since sgRNA transduction efficiency can be measured by other approaches, the blue channel could be used together with the remaining two channels to label cellular structures. Bright field imaging combined with deep learning can be used to reconstruct the localization of fluorescent labels (Ounkomol, C.; Seshamani, S.; Maleckar, M. M.; Collman, F.; Johnson, G. R. 2018), making it possible to use bright field imaging to further expand the phenotypes that can be studied with our technique.

      -In general, the figures are hard to read, with most space being dedicated to beautiful but complex schematics/workflows. Points and fonts should be bigger, and the authors should consider revising the schematics to take up less space.

      We thank the reviewer for this remark and revised all figures accordingly. Points and fonts were enlarged, and schematics were simplified or removed.

      -There is extensive use of editorializing adverbs. Adverbs such as "highly" (abstract and page 15), "easily" (pages 4 and 11), "completely" (page 11), and "only" (page 12) are unnecessary at best and unsupported by the data at worst (e.g. cells are not "completely" separable with 100 ms photo-conversion, see page 11 and Figure 1C). Please remove "completely" from page 11 and consider removing other adverbs as well.

      We agree with the reviewer and the following adverbs have been removed: “highly” in abstract and page 15; “easily” on pages 4 and 11; “completely” on page 11 and three “only” on page 12.

      -Apologies if I missed it, but I couldn't find a data availability statement. Sequencing reads from the experiments should be deposited in SRA or GEO and made available upon publication.

      We apologize that we missed this, and the sequencing data has been deposited to GEO (GSE156623) which will be made available before final publication. The following part has been added to address this.

      p. 24

      DATA AND SOFTWARE AVAILABILITY

      The raw and processed data for the high throughput sequencing results have been deposited in the NCBI GEO database under accession number GSE156623. The plugin Auto-PhotoConverter, developed for the open source microscope control software μManager (Edelstein, A. D.; Tsuchida, M. A.; Amodaj, N.; Pinkard, H.; Vale, R. D.; Stuurman, N. 2014), has been deposited on GitHub (https://github.com/nicost/mnfinder).

      **Specific comments**

      Pages 5/6 - The authors present experiments that show that optical enrichment is highly specific for desired cells. But, they should consider presenting precision (fraction of called positives that are true positive) and recall (fraction of all true positives that are called positive) instead. I think these relate more directly to a pooled optical screen than specificity.

      We apologize for our poor terminology. Our original definition of “specificity” is the same as “precision” suggested by the reviewer. To avoid future confusion, we have changed all relevant occurrences of “specificity” into “precision”. The following sentence was modified to clarify the definition:

      p. 5 line 15

      To evaluate the precision (the fraction of called positives that are true positives) of this assay, all cells were collected and analyzed by FACS after image analysis and photo-activation (Fig. 2d and 2e). We calculated precision as the fraction of photo-activated cells (mCherry positive cells) that are true positives (mIFP-mCherry double positive cells) (Fig. 2f).

      Measuring recall is complicated because the microscope is unable to visit all locations in the imaging plate, hence recall will depend on the fraction of cells actually “seen” by the microscope. For the screening strategy employed in the nuclear size screen, recall is not as important as precision, since lower recall rates are compensated for by screening larger cell numbers. We therefore did not attempt to measure recall directly.
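The definitions discussed above can be written out as a short sketch. The cell counts below are hypothetical and chosen only to illustrate the arithmetic; they are not data from the manuscript. Recall is included for completeness, although, as noted, it was not measured because the microscope does not visit every cell.

```python
def precision(true_positives, false_positives):
    """Fraction of called positives (mCherry positive) that are true positives."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("no cells were called positive")
    return true_positives / called


def recall(true_positives, false_negatives):
    """Fraction of all true positives that were called positive."""
    actual = true_positives + false_negatives
    if actual == 0:
        raise ValueError("no true positives present")
    return true_positives / actual


# Hypothetical FACS counts: 800 mIFP-mCherry double positive cells and
# 200 cells that are mCherry positive but mIFP negative.
print(precision(true_positives=800, false_positives=200))
```

With these made-up counts, 800 of the 1000 photo-activated cells are true positives.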

      Page 6 - Related to the above point, the authors state "These results indicate the assay yields reliable hit identification regardless of the percentage of hits in the library." This statement seems too strong given that the authors looked at specificity experimentally with a mixture of ~1% mIFP positive cells. In fact, hits might be much less than 1% of the total population of cells, and specificity would certainly fall from the 80% measured at 1% of the total population. The authors should do a bit more to fairly discuss their ability to find rare hits.

      We agree with the reviewer and have changed the following description:

      p. 5 line 20

      The precision varied with the initial percentage of mIFP positive cells and ranged from 80% to ~100% (initial percentage of mIFP positive cells ranging between 2.3% and 43.7%) (Fig. 2f). Precision is expected to fall below 80% with initial percentage of mIFP positive cells less than 2.3%. However, these results indicate that optical enrichment can be used to identify hits with high precision even at relatively low hit rates.

      Pages 6/7 - The authors perform a validation experiment using two different sgRNA libraries, infecting mIFP- and mIFP+ cells separately. Then, they demix these populations via optical enrichment, sequence and compute a phenotype score for sgRNAs or groups of sgRNAs. The way the experiment is described and visualized is extremely confusing. If I understood correctly (and I am not sure that I did), the bottom right panel of Figure 2b shows that if sgRNAs are (randomly?) paired AND two replicates are combined then optical enrichment nearly perfectly separates all (combined, paired) sgRNAs in the two libraries. The authors should rewrite this section, especially clarifying what is meant by "1 sgRNA/group and 2 sgRNA/group," and consider changing Figure 2b (perhaps just show the lower right panel?).

      We apologize for our confusing description. To avoid the confusion, we rewrote the paragraph describing the experiment and added a schematic (Fig. 3a) to better describe this experiment. We also simplified the result by just presenting the lower right panel of original Fig. 2b (current Fig. 3b) and moved the other data into supplementary figures (Fig. S2).

      p. 6 line 4

      mIFP negative cells and mIFP positive cells were separately infected with two different CRISPRa sgRNA libraries (6100 sgRNAs for mIFP negative cells; 860 sgRNAs for mIFP positive cells) at a low multiplicity of infection (MOI) to guarantee a single sgRNA per cell. Note that in these experiments, the sgRNAs only function as barcodes to be read out by sequencing, but do not cause phenotypic changes as the cells do not express corresponding CRISPR reagents. These two populations were then mixed at a ratio of 9:1 mIFP negative cells: mIFP positive cells. We again used mIFP expression as our phenotype of interest (outlined in Fig. 3a). Two biological replicates were performed and at least 200-fold coverage of each sgRNA library was guaranteed throughout the screen, including library infection, puromycin selection, imaging/photo-activation and FACS.

      Page 8 - Related to Supplementary Figure 3, why are there not clear BFP+ and BFP- populations but instead one continuous population? How was the gating determined (e.g. how was the boundary between red and gray picked)? Here, and generally, flow plots and histograms of flow plots should indicate the number of cells. If replicates were performed, they should be included.

      We have clarified our description. There are no clear BFP+ and BFP- populations but instead one continuous population due to the background expression of BFP from the dCas9 construct: dCas9-KRAB-BFP (which is now clearly indicated in the manuscript). On top of the dCas9-KRAB-BFP, another BFP is encoded on the sgRNA construct, which leads to a higher BFP expression level.

      There was no gating in the experiment. The grey dots in the figure represent wild type cells without viral transduction, while the red dots (partially covered by the grey dots) are cells infected with the two negative control sgRNAs. We mistakenly wrote in the legend of the original Fig. S3 (current Fig. S3a) that these were FACS data; however, the data were acquired by imaging. We apologize for the confusion and thank the reviewer for detecting the issue. We completely rewrote the legend to Fig. S3a (original Fig. S3) to clarify.

      We now include the number of cells analyzed and the number of replicates for the other flow plots and histograms in the manuscript.

      Page 8 - "Nuclear sizes...". The authors should say in the main text what size metric was used.

      To address the reviewer’s point, we have included the following sentence:

      p. 8 line 23

      We defined nuclear size as the 2D area in square microns measured by H2B-mGFP using an epifluorescence microscope, as determined by automated image analysis (Fig. 4a and Supplementary file 2).

      Page 9 - I am a little confused about the statistical analysis of the screen. In Supplementary File 1, the authors state that p-values were "calculated based on comparison between the distribution of all the phenotypic scores of sgRNAs targeting to the gene/assigning in the group and the one of negative control sgRNAs in the libraries." I presume this means that all phenotypic scores (across replicates) of all sgRNAs targeting each gene were included in a Mann Whitney U test with a single randomized set of phenotypic scores. If that's right, it seems like an odd way to get p-values. Better would be a randomization test, where a null distribution of phenotypic scores for each gene is built by randomizing sgRNA-level scores many times. Then the actual phenotypic score is compared to the randomized null distribution, yielding a p-value. In any case, the authors must clarify what they did in the main text and Supplementary File 1.

      Page 9 - It does not appear that the p-values presented in Figure 3c have been adjusted for multiple hypothesis testing. This should be done.

      Page 9 - "A value of the top 0.1 percentile of control groups was used as a cutoff for hits." Why? This seems arbitrary. It seems like appropriate false-discovery rate control would enable a more rigorous method for choosing a cutoff.

      Page 9 - The same comments regarding analysis and scoring of the optical enrichment screen applies to the FSC and GFP screens.

      We clarified the description of the statistical analysis of the screen (see new/changed text below). Mann-Whitney p-values for the two replicates were calculated independently. The Mann-Whitney U test was not performed against a randomized set of phenotypic scores, but using the phenotypic scores of the 22 control non-targeting sgRNAs that were part of the library. Because there are only 22 control sgRNAs (adding more control sgRNAs would increase the size of the library, and reduce the number of genes that can be screened within a given amount of time), the statistical significance of testing genes against these controls is not expected to be very high, and direct approaches such as multiple hypothesis testing are not expected to yield hits. Instead, we calculated a score combining the severity (phenotypic score) and the trustworthiness (Mann-Whitney p value) of the phenotype (a method previously developed in the Weissman lab at UCSF: https://github.com/mhorlbeck/ScreenProcessing). We thank the reviewer for suggesting false discovery rate control as a better method for choosing a cutoff. We modified our original analysis and now determine the threshold of our score based on a calculated empirical false discovery rate (eFDR). We used this approach to maximize the number of true hits and relied on a repeat of the screen and follow-up testing of hits to narrow down true hits. We added the following part in the method section and added an analysis example to the supplementary files (Supplementary file 9).

      p. 22

      Bioinformatic analysis of the screen

      Analysis was based on the ScreenProcessing pipeline developed in the Weissman lab (https://github.com/mhorlbeck/ScreenProcessing) (Horlbeck, M. A.; Gilbert, L. A.; Villalta, J. E.; Adamson, B.; Pak, R. A.; Chen, Y.; Fields, A. P.; Park, C. Y.; Corn, J. E.; Kampmann, M.; Weissman, J. S. 2016). The phenotypic score (ε) of each sgRNA was quantified as previously defined (Kampmann, M.; Bassik, M. C.; Weissman, J. S. 2013) (Supplementary file 9). For the mIFP proof-of-principle screen, phenotypic score of each group was the average score of two sgRNAs assigned to the group and averaged between two replicates unless otherwise described. For the nuclear size screen, FSC screen and H2B-mGFP screen, genes were scored based on the average phenotypic scores of the sgRNAs targeting them. For the nuclear size screen, phenotypic scores were further averaged between 4 runs for each replicate. For the nuclear size screen, FSC screen and H2B-mGFP screen, sgRNAs were first clustered by transcription start site (TSS) and scored by the Mann-Whitney U test against 22 non-targeting control sgRNAs included in the library. Since only 22 control sgRNAs were included, significance of hits was assessed by comparison with simulated negative controls that were generated by random assignment of all sgRNAs in the library and phenotypic scores of these simulated negative controls were scored in the same way as phenotypic scores for genes. A score η that includes the phenotypic score and its significance was calculated for each gene and simulated negative control. The optimal cut-off for score η was determined by calculating an empirical false discovery rate (eFDR) at multiple values of η as the number of simulated negative controls with score η higher than the cut-off (false positives) divided by the sum of genes and simulated negative controls with score η higher than the cut-off (all positives). 
The cut-off score η resulting in an eFDR of 0.1% was used to call hits for further analysis (Supplementary file 9). An example analysis is described in detail in Supplementary file 9 and raw counts and phenotypic scores for all four screens are listed in Supplementary file 10 and 11.
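The eFDR definition above can be illustrated with a minimal sketch. This is not the ScreenProcessing code: the function names and the toy η values are hypothetical, and choosing the smallest cut-off that meets the target eFDR is an assumption about how the "optimal" cut-off is selected.

```python
def empirical_fdr(gene_scores, control_scores, cutoff):
    """eFDR = simulated controls above cutoff / (genes + controls above cutoff)."""
    false_pos = sum(s > cutoff for s in control_scores)
    all_pos = false_pos + sum(s > cutoff for s in gene_scores)
    return false_pos / all_pos if all_pos else 0.0


def lowest_cutoff(gene_scores, control_scores, target_fdr):
    """Smallest observed score whose eFDR is at or below target_fdr (assumed rule)."""
    for cutoff in sorted(set(gene_scores) | set(control_scores)):
        if empirical_fdr(gene_scores, control_scores, cutoff) <= target_fdr:
            return cutoff
    return None  # only reached when both score lists are empty


# Toy eta scores, for illustration only.
genes = [0.5, 1.0, 3.0, 4.0, 5.0]
simulated_controls = [0.2, 0.4, 0.8, 1.5]
cutoff = lowest_cutoff(genes, simulated_controls, target_fdr=0.001)
hits = [g for g in genes if g > cutoff]
print(cutoff, hits)
```

Scanning only observed score values keeps the sketch simple; a real analysis would typically use many more simulated controls than this toy example, since a target as strict as 0.1% cannot be resolved with a handful of them.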

      Page 9 - "These data suggest that a direct measurement utilizing a microscope can provide significant improvement in hit yield even for phenotypes that could be indirectly screened with other approaches." I think this conclusion is too strong. It rests on the assumption that the FSC/GFP phenotypes should have the same set of hits as the microscope phenotype (larger nuclear area). This may not be the case. For example, genes whose inactivation increases GFP expression would be hits in the former, but not latter case. The authors should moderate this statement.

      We agree with the reviewer and have changed the sentence into:

      p. 10 line 17

      These data suggest that a direct measurement utilizing a microscope can provide different information and reveal hits that are inaccessible using other screening approaches.

      Page 11 - "This is significantly faster than the in situ methods." The authors should provide a citation and an actual comparison to the speed of in situ methods.

      We agree with the reviewer and have modified the sentence with a citation:

      p. 12 line 20

      This is significantly faster than in situ methods, which process millions of cells over a period of a few days (Feldman, D.; Singh, A.; Schmid-Burgk, J. L.; Carlson, R. J.; Mezger, A.; Garrity, A. J.; Zhang, F.; Blainey, P. C. 2019).

      Page 12 - I think the authors could say a bit more about the possibility of low hit rate screens. How low do they think it is feasible to go? What hit rates are expected based on existing arrayed optical screens?

      We have added more description in the discussion section:

      p. 13 the second paragraph

      Optical enrichment screening is also possible for phenotypic screens with relatively low hit rates (defined as the fraction of all genes screened that are true hits). The ability to detect hits at low hit rates in our method depends on multiple factors, including: 1) the penetrance of the phenotype; 2) cellular fitness effect of the phenotype; 3) detection and photo-activation accuracy of the phenotype; 4) limitations imposed by FACS recovery and sequencing sample preparations of low cell numbers. The first three factors vary with the phenotype of interest. We optimized the genomic DNA preparation protocol (Methods), and are now able to process sequencing samples from a few thousand cells, enabling screens of low hit rate phenotypes. In our nuclear size screen, more than 1.5 million cells were analyzed during each run with 2000-4000 cells recovered after FACS sorting. The hit rate of this screen was 2.76%, similar to optical CRISPR screens performed in an arrayed format (de Groot, R.; Luthi, J.; Lindsay, H.; Holtackers, R.; Pelkmans, L. 2018), demonstrating the possibility to apply our approach to investigate phenotypes with low hit rates.

      Page 14 - It is weird that the discussion includes a fairly important couple of paragraphs that seem to belong in the results (e.g. the text surrounding Figure 4b and c). Obviously, I don't want to prescribe stylistic changes, but I suggest the authors consider moving this description of the experiments/analyses to the results.

      The relevant description has been moved to the results.

      Page 14 - The authors validate their hits individually, and observe that expression of hit sgRNAs does increase nuclear size in some cells. But, many/most cells remain control-like in these validation experiments. The authors should comment on why this is the case (e.g. inefficient knockdown, cell cycle effects, etc).

      To address this point, we have added the following sentences in legend of Fig. 5:

      The cell population is heterogeneous due to inefficient knockdown, incomplete puromycin selection, and penetrance of the phenotype. A BFP was expressed from the same sgRNA construct. Only cells with high BFP intensity, indicating successful sgRNA transduction, were included for data analysis as described in Methods.

      Page 14 - It would be nice to formally compare the control and sgRNA distributions in each panel of 4a and Supplementary Figure 5 (e.g. with a Kolmogorov-Smirnov test, etc). That would allow a more precise statement to be substituted for "14 out of 15 hits (the exception was TACC3) were confirmed to be real hits, with cells exhibiting larger nuclei after knock down (Fig. 4a and Fig. S5)," which is not quantitative.

      We applied the Kolmogorov-Smirnov test and the corresponding sentence was changed into:

      p. 10 last line

      *14 out of 15 hits were confirmed to be real hits (Kolmogorov-Smirnov test two tailed p-value

      Figure 2a - I am not sure it is necessary to show the entire workflow again. The first and possibly last panels are the informative ones here.

      Figure 3a - Same comment as above - these workflow panels take up a lot of real estate and I suggest simplifying them if possible.

      The figures were simplified to just show the example images.

      Figure 3c - At least on my PDF/screen, the "scrambled control" points appear very light gray and are impossible to find. They should be an easier to spot color.

      We agree with the reviewer and changed the color.

      Figure 4b - "Most cells developed a larger cellular size and higher H2B-mGFP level after knock down." I think it would be more accurate to say that the median cell size/GFP level increased, or that some cells developed larger sizes/median GFP levels.

      We agree with the reviewer’s point; “most” has been changed to “some”.

      Figure 4c - I don't understand "Normalized FITC/nuclear size." Do the bars show the mean/median of a population (if so, why not show a dot plot or box plot or violin plot)? Also, what is FITC (I presume it's GFP levels)?

      Figure 4c - "Most cells maintained a constant ratio between nuclear size and DNA content..." I'm not sure where DNA content came from. Are the authors assuming that their H2B-mGFP is a proxy for DNA content? Or was some other measurement made? If the former, is there a citable reason why this is a good assumption?

      The bars represent the ratio of the median level of H2B-mGFP intensity (the axis is now labeled with "GFP" rather than "FITC", the colloquial name for the channel used on the FACS machine) measured by FACS and the median nuclear size of the same population of cells measured by microscopy. We plan to perform additional experiments to measure DNA content using a DNA dye in the same cell by microscopy so that we will be able to correlate these on a cell by cell basis. Data will be added before final publication.

      Reviewer #2 (Significance (Required)):

      I don't generally comment on significance in reviews. Since ReviewCommons is specifically asking, I'll say that this manuscript describes optical enrichment, a method that is an extension of previous work and is substantially similar to a previously published method, Visual Cell Sorting. However, given the timing, it is obvious that these authors have been working independently on optical enrichment. Since the application is distinct, and optical enrichment incorporates some nice features like software to make it easier to execute, it is clearly of independent value.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study reports a rapid and high-throughput CRISPR-based phenotypic screen approach consisting of selecting cells with phenotypes of interest, labelling them by photo-conversion and isolating them by FACS. The idea of the method is interesting (has been around) in principle. The key advantage is that it is relatively simple and accessible to many groups, as it does not require robotics. However, the manuscript is so badly written and hard to follow that it is difficult to judge the technology, to really understand how the experiments were done and whether the results are interpreted correctly. Strictly speaking, it is unclear whether and how good scientific practices (GSP) have been followed, as the description of the experiments is sometimes lacking totally. Consequently, it is impossible to seriously evaluate this study and judge whether the technology described is really promising. It is probably less sensitive than arrayed screens, in all likelihood can miss hits that affect growth, cannot capture as many phenotypic classes as one would like from high-content screens and the computational and experimental workflow is more complicated. It is puzzling that the authors don't even compare the results with arrayed screens which are of course the current gold-standard.

      We do not in any way claim that the presented method replaces arrayed screens. However, most current sgRNA libraries are pooled libraries, and the few available arrayed sgRNA libraries are expensive and difficult to maintain, hence our methods to screen pooled sgRNA libraries are timely and useful. Comparisons with arrayed screens are unwarranted as no claims are made with respect to arrayed screens.

      We have clarified the manuscript in many places, and hope it is now readable and better understandable by more readers with diverse backgrounds.

      **Specific points:**

      The specificity test (Fig 1) does not make sense as described. If the authors spike a certain percentage of cells that can be photoconverted, when analysing the outcome, there will be three classes: mIFP positive, mIFP/mCherry positive and negative. How can they calculate specificity if they do not know whether they converted all mIFP cells? Also, the formula used is questionable, or is there an error? Furthermore, it is totally unclear how many cells were used and how they were scanned. If they took 90 negative cells and 10 mIFP cells, getting them all back is easy. If they start with 10e9 cells, the specificity should be quantified. Furthermore, the phenotype they pick is an easy and convenient one. Much more challenging is to apply it on a multi-parametric phenotype. Again, this is now the gold standard.

      We used the term specificity inadvertently and should have used precision, as also pointed out by Referee 2. This has been corrected in the current manuscript. We picked the mIFP phenotype as this was a proof of principle screen to clarify the performance of our screening approach and needed a phenotype that can be measured both by microscopy and FACS. We demonstrate that multi-parametric read-outs are possible, but do not think that the first demonstration of new technology needs such an application.

      In their first sgRNA assay, it is not possible to have a clear idea of what groups they are talking about. Do they mean they get phenotypic signatures which they group? How? They need to describe what they do. Here, only ~3500 genes are scanned (the 6843 is both populations and you only select from the mIFP neg population) and it took them 8hrs. This means for the genome it would require ~60h which is indeed fast. However, this experiment is not clearly described. They cannot select the negative population since there is no fluorescent marker (except false positive which are around 1.7%). So I assume they just randomly pick cells (they should really explain much better what they do!). Why go through the hassle? If these sequences are supposed to be a negative population, just pick them in the computer. Also, they cannot calculate an enrichment compared to the negative population, since two different libraries were infected. Again, I can't follow.

      We improved the description of this experiment. To clarify, we used mIFP in a proof-of-concept screen to validate whether sgRNAs infecting mIFP-positive cells can be distinguished from those infecting mIFP-negative cells. No phenotypic signature other than the mIFP signal is used (as described in the text). As is customary in pooled screens, a primary comparison was made between the positive (optically selected) cells and the complete population. To improve the clarity of this screen, we now describe the concept of pooled sgRNA screens in more detail, since its absence may have made this section harder to follow.

      I find their results about calculating scores based only on true negatives surprising. The average phenotypic score is improved from 3 to 5, which is enormous. This suggests that the phenotypes induced in the mIFP population are extremely common. These results are hard to interpret given the poor description of the experiment. It is possible that it is the same dataset as in 1, but in that case, the false negatives must be rare since the negatives can be selected by absence of both mCherry and mIFP.

      There are no phenotypes induced in the mIFP population (as now explicitly explained in the text). The mIFP population is isolated using optical enrichment, and we test our ability to discriminate the sgRNAs present in the enriched population. It is unsurprising that comparing to the negatively selected population (which is not possible in most other pooled screens) is significantly better than comparing against the total population (as customary in pooled screens).

      In the nuclear size screen, 6000 sgRNAs were screened. To array so many sequences would require 20 plates. They required ~40h for imaging one replicate. This is slow, imagine the time with a 60x lens.

      There are no arrayed screens performed in our study.

      Reviewer #3 (Significance (Required)):

      Overall, there is insufficient evidence in this manuscript to convince this reviewer that this method is valid and truly powerful. I cannot support publication in its present form.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This study reports a rapid and high-throughput CRISPR-based phenotypic screen approach consisting of selecting cells with phenotypes of interest, labelling them by photo-conversion and isolating them by FACS. The idea of the method is interesting in principle (it has been around). The key advantage is that it is relatively simple and accessible to many groups, as it does not require robotics. However, the manuscript is so badly written and hard to follow that it is difficult to judge the technology, to really understand how the experiments were done and whether the results are interpreted correctly. Strictly speaking, it is unclear whether and how good scientific practices (GSP) have been followed, as the description of the experiments is sometimes lacking totally. Consequently, it is impossible to seriously evaluate this study and judge whether the technology described is really promising. It is probably less sensitive than arrayed screens, in all likelihood can miss hits that affect growth, cannot capture as many phenotypic classes as one would like from high-content screens, and the computational and experimental workflow is more complicated. It is puzzling that the authors don't even compare the results with arrayed screens, which are of course the current gold standard.

      Specific points:

      The specificity test (Fig 1) does not make sense as described. If the authors spike a certain percentage of cells that can be photoconverted, when analysing the outcome, there will be three classes: mIFP positive, mIFP/mCherry positive and negative. How can they calculate specificity if they do not know whether they converted all mIFP cells? Also, the formula used is questionable, or is there an error? Furthermore, it is totally unclear how many cells were used and how they were scanned. If they took 90 negative cells and 10 mIFP cells, getting them all back is easy. If they start with 10e9 cells, the specificity should be quantified. Furthermore, the phenotype they pick is an easy and convenient one. Much more challenging is to apply it on a multi-parametric phenotype. Again, this is now the gold standard.

      In their first sgRNA assay, it is not possible to have a clear idea of what groups they are talking about. Do they mean they get phenotypic signatures which they group? How? They need to describe what they do. Here, only ~3500 genes are scanned (the 6843 is both populations and you only select from the mIFP neg population) and it took them 8hrs. This means for the genome it would require ~60h which is indeed fast. However, this experiment is not clearly described. They cannot select the negative population since there is no fluorescent marker (except false positive which are around 1.7%). So I assume they just randomly pick cells (they should really explain much better what they do!). Why go through the hassle? If these sequences are supposed to be a negative population, just pick them in the computer. Also, they cannot calculate an enrichment compared to the negative population, since two different libraries were infected. Again, I can't follow.

      I find their results about calculating scores based only on true negatives surprising. The average phenotypic score is improved from 3 to 5, which is enormous. This suggests that the phenotypes induced in the mIFP population are extremely common. These results are hard to interpret given the poor description of the experiment. It is possible that it is the same dataset as in 1, but in that case, the false negatives must be rare since the negatives can be selected by absence of both mCherry and mIFP.

      In the nuclear size screen, 6000 sgRNAs were screened. To array so many sequences would require 20 plates. They required ~40h for imaging one replicate. This is slow, imagine the time with a 60x lens.

      Significance

      Overall, there is insufficient evidence in this manuscript to convince this reviewer that this method is valid and truly powerful. I cannot support publication in its present form.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Yan et al. present optical enrichment, a method for conducting pooled optical screens. Optical enrichment works by combining microscopy to mark cells of interest using the PA-mCherry photo-activatable fluorescent protein with FACS to recover them. The method is similar to other methods (Photostick, Visual Cell Sorting), and provides an alternative to in situ sequencing/FISH methods. The authors use optical enrichment to conduct a pooled optical CRISPRi screen for nuclear size. They identify and exhaustively validate hits, showing that optical enrichment works for its intended purpose. The development of a uManager protocol and discussion of the number of sgRNAs required for a genetic screen using optical enrichment were welcome. The authors' reported throughput of 1.5 million cells per eight-hour experiment is impressive, and the demonstrated use of low cell number input for next generation sequencing appears promising. Overall, the manuscript is well written, the methods clear and the claims supported by the data presented.

      General comments

      -I found the analysis and scoring methods to be lacking, both in terms of the clarity of description and in terms of what was actually done. The authors might consider using established methods (eg https://www.biorxiv.org/content/10.1101/819649v1.full). In any case, they should revise the text to clarify what was done and address the other concerns raised below.

      -Relatedly, details regarding how to perform the experiments described are lacking. It is not clear from the text, figures, "Online Methods" section, and Supplementary Files whether all imaging is performed before activation, or whether each field of view is subject to an individual round of imaging followed by activation. It is also unclear whether cells in 96 well plates are sorted as 96 separate tubes or pooled into a single tube prior to sorting. Furthermore, at a minimum, the following details are requested for each optical enrichment "run". These details are critical considerations for those who seek to use optical enrichment in their own laboratories:

      • Seeding density
      • Time elapsed (in hours) between cell plating and optical enrichment
      • The number of fields of view examined
      • The median number of cells per field of view; the proportion of each plate's surface area that is imaged and photo-converted
      • The total time taken (in hours) to perform imaging and photoconversion
      • The gating protocol used for sorting by FACS (preferably including a figure with example gates for one or two experiments). The gating protocol is described for the genetic screen but not for the control experiments.

      -The authors use PA-mCherry. There are a variety of other photo-activatable fluorophores available, and it would be good for them to comment on why they chose PA-mCherry. Also, since the method is supposed to be used for generic pooled optical screens, it would be good for the authors to comment on what colors remain available for imaging cellular structures.

      -In general, the figures are hard to read, with most space being dedicated to beautiful but complex schematics/workflows. Points and fonts should be bigger, and the authors should consider revising the schematics to take up less space.

      -There is extensive use of editorializing adverbs. Adverbs such as "highly" (abstract and page 15), "easily" (pages 4 and 11), "completely" (page 11), and "only" (page 12) are unnecessary at best and unsupported by the data at worst (e.g. cells are not "completely" separable with 100 ms photo-conversion, see page 11 and Figure 1C). Please remove "completely" from page 11 and consider removing other adverbs as well.

      -Apologies if I missed it, but I couldn't find a data availability statement. Sequencing reads from the experiments should be deposited in SRA or GEO and made available upon publication.

      Specific comments

      Pages 5/6 - The authors present experiments that show that optical enrichment is highly specific for desired cells. But, they should consider presenting precision (fraction of called positives that are true positive) and recall (fraction of all true positives that are called positive) instead. I think these relate more directly to a pooled optical screen than specificity.
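
      The two quantities defined in this comment can be computed directly from sorted-cell counts. The following is a minimal sketch; the function name and the counts are hypothetical illustrations, not values from the manuscript.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from cell counts.

    precision = fraction of called positives that are true positives
    recall    = fraction of all true positives that are called positive
    """
    called_pos = true_pos + false_pos
    all_true = true_pos + false_neg
    precision = true_pos / called_pos if called_pos else float("nan")
    recall = true_pos / all_true if all_true else float("nan")
    return precision, recall


# Hypothetical counts: 100 cells recovered in the photo-activated gate,
# 80 of them truly mIFP positive, and 120 mIFP-positive cells missed.
p, r = precision_recall(true_pos=80, false_pos=20, false_neg=120)
print(f"precision={p:.2f} recall={r:.2f}")
```

      Note that recall requires knowing how many true positives were missed, which is exactly the information Referee #3 asks about.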

      Page 6 - Related to the above point, the authors state "These results indicate the assay yields reliable hit identification regardless of the percentage of hits in the library." This statement seems too strong given that the authors looked at specificity experimentally with a mixture of ~1% mIFP positive cells. In fact, hits might be much less than 1% of the total population of cells, and specificity would certainly fall from the 80% measured at 1% of the total population. The authors should do a bit more to fairly discuss their ability to find rare hits.
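
      The dependence on hit frequency can be illustrated with a short calculation. This sketch assumes that the reported 80% is read as precision, that recall is 1.0, and that the false-positive rate per negative cell stays fixed as hits become rarer; all three are assumptions made here for illustration only.

```python
def precision_at(prevalence, recall, false_pos_rate):
    """Expected precision when a fraction `prevalence` of cells are true hits."""
    true_pos = prevalence * recall
    false_pos = (1.0 - prevalence) * false_pos_rate
    return true_pos / (true_pos + false_pos)


def implied_false_pos_rate(prevalence, recall, precision):
    """False-positive rate per negative cell implied by an observed precision."""
    return prevalence * recall * (1.0 - precision) / (precision * (1.0 - prevalence))


# Assumption: 80% precision observed at ~1% positive cells, recall of 1.0.
fpr = implied_false_pos_rate(prevalence=0.01, recall=1.0, precision=0.80)
for prevalence in (0.01, 0.001, 0.0001):
    print(prevalence, round(precision_at(prevalence, 1.0, fpr), 3))
```

      Under these assumptions, precision falls steeply once hits are rarer than the false-positive rate, which is the concern raised here.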

      Pages 6/7 - The authors perform a validation experiment using two different sgRNA libraries, infecting mIFP- and mIFP+ cells separately. Then, they demix these populations via optical enrichment, sequence and compute a phenotype score for sgRNAs or groups of sgRNAs. The way the experiment is described and visualized is extremely confusing. If I understood correctly (and I am not sure that I did), the bottom right panel of Figure 2b shows that if sgRNAs are (randomly?) paired AND two replicates are combined then optical enrichment nearly perfectly separates all (combined, paired) sgRNAs in the two libraries. The authors should rewrite this section, especially clarifying what is meant by "1 sgRNA/group and 2 sgRNA/group," and consider changing Figure 2b (perhaps just show the lower right panel?).

      Page 8 - Related to Supplementary Figure 3, why are there not clear BFP+ and BFP- populations but instead one continuous population? How was the gating determined (e.g. how was the boundary between red and gray picked)? Here, and generally, flow plots and histograms of flow plots should indicate the number of cells. If replicates were performed, they should be included.

      Page 8 - "Nuclear sizes...". The authors should say in the main text what size metric was used.

      Page 9 - I am a little confused about the statistical analysis of the screen. In Supplementary File 1, the authors state that p-values were "calculated based on comparison between the distribution of all the phenotypic scores of sgRNAs targeting to the gene/assigning in the group and the one of negative control sgRNAs in the libraries." I presume this means that all phenotypic scores (across replicates) of all sgRNAs targeting each gene were included in a Mann Whitney U test with a single randomized set of phenotypic scores. If that's right, it seems like an odd way to get p-values. Better would be a randomization test, where a null distribution of phenotypic scores for each gene is built by randomizing sgRNA-level scores many times. Then the actual phenotypic score is compared to the randomized null distribution, yielding a p-value. In any case, the authors must clarify what they did in the main text and Supplementary File 1.
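
      The randomization test suggested here could be sketched as follows. This is one possible reading, not the authors' method: the gene-level score is taken to be the mean of its sgRNA-level scores, the null is built by drawing the same number of sgRNA scores at random from the whole library, and the function name, pool and example scores are hypothetical.

```python
import random


def randomization_pvalue(gene_scores, all_scores, n_draws=5000, seed=0):
    """One-sided p-value for a gene's mean sgRNA score against a randomized null."""
    rng = random.Random(seed)
    k = len(gene_scores)
    observed = sum(gene_scores) / k
    at_least_as_extreme = 0
    for _ in range(n_draws):
        # Draw k sgRNA-level scores at random, ignoring gene assignment.
        null_score = sum(rng.sample(all_scores, k)) / k
        if null_score >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_draws + 1)


# Hypothetical library of sgRNA-level phenotypic scores centred on zero.
pool_rng = random.Random(1)
library_scores = [pool_rng.gauss(0.0, 1.0) for _ in range(2000)]

p_hit = randomization_pvalue([3.1, 2.8, 3.4], library_scores)
p_null = randomization_pvalue([0.1, -0.2, 0.05], library_scores)
print(p_hit, p_null)
```

      The smallest attainable p-value is 1 / (n_draws + 1), so the number of draws should be chosen with the intended multiple-testing correction in mind.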

      Page 9 - It does not appear that the p-values presented in Figure 3c have been adjusted for multiple hypothesis testing. This should be done.
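
      One common adjustment is the Benjamini-Hochberg procedure, sketched below with made-up p-values. The referee does not name a specific method, so this choice is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
print(adj)
```

      Genes whose adjusted p-value falls below a chosen false-discovery rate (for example 0.05) would then be called hits, which would also give a principled alternative to a fixed percentile cutoff.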

      Page 9 - "A value of the top 0.1 percentile of control groups was used as a cutoff for hits." Why? This seems arbitrary. It seems like appropriate false-discovery rate control would enable a more rigorous method for choosing a cutoff.

      Page 9 - The same comments regarding analysis and scoring of the optical enrichment screen apply to the FSC and GFP screens.

      Page 9 - "These data suggest that a direct measurement utilizing a microscope can provide significant improvement in hit yield even for phenotypes that could be indirectly screened with other approaches." I think this conclusion is too strong. It rests on the assumption that the FSC/GFP phenotypes should have the same set of hits as the microscope phenotype (larger nuclear area). This may not be the case. For example, genes whose inactivation increases GFP expression would be hits in the former, but not latter case. The authors should moderate this statement.

      Page 11 - "This is significantly faster than the in situ methods." The authors should provide a citation and an actual comparison to the speed of in situ methods.

      Page 12 - I think the authors could say a bit more about the possibility of low hit rate screens. How low do they think it is feasible to go? What hit rates are expected based on existing arrayed optical screens?

      Page 14 - It is weird that the discussion includes a fairly important couple of paragraphs that seem to belong in the results (e.g. the text surrounding Figure 4b and c). Obviously, I don't want to prescribe stylistic changes, but I suggest the authors consider moving this description of the experiments/analyses to the results.

      Page 14 - The authors validate their hits individually, and observe that expression of hit sgRNAs does increase nuclear size in some cells. But, many/most cells remain control-like in these validation experiments. The authors should comment on why this is the case (e.g. inefficient knockdown, cell cycle effects, etc).

      Page 14 - It would be nice to formally compare the control and sgRNA distributions in each panel of 4a and Supplementary Figure 5 (e.g. with a Kolmogorov-Smirnov test, etc). That would allow a more precise statement to be substituted for "14 out of 15 hits (the exception was TACC3) were confirmed to be real hits, with cells exhibiting larger nuclei after knock down (Fig. 4a and Fig. S5)," which is not quantitative.
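
      As an illustration of such a comparison, the two-sample Kolmogorov-Smirnov statistic is the largest vertical distance between the two empirical cumulative distributions. The sketch below computes only the statistic, on hypothetical nuclear-area values; in practice a library routine such as `scipy.stats.ks_2samp` would also supply the p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max distance between empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The maximum is always attained at one of the observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical nuclear areas (arbitrary units) for control and knock-down cells.
control = [100, 105, 110, 115, 120]
knockdown = [118, 125, 130, 140, 150]
print(ks_statistic(control, knockdown))
```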

      Figure 2a - I am not sure it is necessary to show the entire workflow again. The first and possibly last panels are the informative ones here.

      Figure 3a - Same comment as above - these workflow panels take up a lot of real estate and I suggest simplifying them if possible.

      Figure 3c - At least on my PDF/screen, the "scrambled control" points appear very light gray and are impossible to find. They should be an easier to spot color.

      Figure 4b - "Most cells developed a larger cellular size and higher H2B-mGFP level after knock down." I think it would be more accurate to say that the median cell size/GFP level increased, or that some cells developed larger sizes/median GFP levels.

      Figure 4c - I don't understand "Normalized FITC/nuclear size." Do the bars show the mean/median of a population (if so, why not show a dot plot or box plot or violin plot)? Also, what is FITC (I presume it's GFP levels)?

      Figure 4c - "Most cells maintained a constant ratio between nuclear size and DNA content..." I'm not sure where DNA content came from. Are the authors assuming that their H2B-mGFP is a proxy for DNA content? Or was some other measurement made? If the former, is there a citable reason why this is a good assumption?

      Significance

      I don't generally comment on significance in reviews. Since ReviewCommons is specifically asking, I'll say that this manuscript describes optical enrichment, a method that is an extension of previous work and is substantially similar to a previously published method, Visual Cell Sorting. However, given the timing, it is obvious that these authors have been working independently on optical enrichment. Since the application is distinct, and optical enrichment incorporates some nice features like software to make it easier to execute, it is clearly of independent value.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript Yan et al describe a method to perform imaging based pooled CRISPR screens based on photoactivation followed by selection and sorting of the cells with the desired phenotypes. They establish a system in mammalian RPE-1 cells where they integrate a photo-activatable mCherry, identify the cells of interest under the microscope based on a phenotype, automatically activate the mCherry fluorescence in these cells and then sort the desired populations by FACS. They demonstrate the reliability of their enrichment method and finally use this approach to look for factors that regulate nuclear size by a targeted pooled CRISPR screen.

      Major points:

      1. This year Hasle et al described a very similar approach that they name Visual Cell Sorting. In this case, they use a photoconvertible fluorescent protein (green-to-red conversion) to select cells with a certain visual cellular phenotype and enrich those by FACS. The Hasle et al 2020 MSB paper is only mentioned together with the other methods in the introduction in one sentence (ref #19 in this manuscript):

      "Recently, several in situ sequencing15,16 and cell isolation methods17-20 were developed which allow microscopes to be used for screening. However, these methods contain non-high throughput steps that limit their scalability."

      I think the current citation of the Hasle et al paper is not really fair. The idea and the execution of the two approaches are almost exactly the same. Here, the authors concentrate on a CRISPR-based application, but obviously the applications of the method are not limited to that. The authors should discuss how these similar ideas can be used in several different applications.

      2. While I understand that the authors mean conversion from the dark state to the fluorescent state when they describe their photo-activatable mCherry, I think the term "photo-conversion" can be confusing for the general reader, since photo-conversion typically refers to a change in color. I would suggest sticking to the term photo-activation.
      3. For validation of the hits coming from the nuclear size screen: Did the authors have any controls making sure that the right targets were down-regulated? This might be obvious for some of the targets (e.g. CPC proteins that are known to induce division errors display the nuclear fragmentation that the authors also observe), but especially for the ones that are less known or unknown to induce any nuclear size change, it will be important to demonstrate the specificity of the targets. In addition, it is not clear from the figure legends and the materials and methods whether these phenotypes are verified by the 3-4 gRNAs they use in the validation. Are the histograms representative of a single experiment with one gRNA or a combination of gRNAs in different experiments? How the data presented in Fig 4 were replicated is unclear.

      Minor points:

      1. Related to major point #3: I could not find much experimental info on how the hits from the screen were verified in materials and methods.
      2. The legend of Figure 4c is not describing what the plot is showing. Instead it tells the readers the authors' interpretation of the data.
      3. There is a typo in Figure S1b.

      Significance

      I think the idea of performing pooled screens coupled to microscopy is exciting, and this approach definitely has more potential than the Craft-ID approach that the authors also discuss in their manuscript. In addition, the approach described in this manuscript is convincing. Although the analysis part will require more work in the future (to adapt the software to recognise different types of phenotypic readouts) to make it accessible to the scientific community, the authors present sufficient evidence that the system can be robust. They also present some clever ideas, such as calculating enrichments with different photo-activation times (2 sec vs 100 ms) followed by separation of these populations by FACS.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to express our utmost gratitude to the three anonymous reviewers for their constructive and insightful comments on our manuscript. We broadly agree with all comments made and have uploaded a preliminary revised version with changes highlighted in bold. We now deal with each of the reviewer comments in turn.

      Reviewer #1

      L50-52: Can you predict where the unmapped reads came from? Could viral infections be the source as in land plants?

      Having done a crude examination of unmapped reads, we couldn't find compelling evidence of them being of viral origin. The unmapped fraction was in fact in the same range as seen for other sRNA libraries in our lab, which we found to occur for a number of reasons such as sequencing errors, incomplete assembly, and differences between the sequenced lines and the reference line. These all result in unmapped reads, an effect compounded by our use of stringent mapping (0 mismatches).

      L67-68, which is the explanation?

      Thank you for querying this. After much closer inspection of the papers cited by Casas-Mollano et al. as evidence of the 23nt peak, the evidence for this peak doesn't seem that strong and may even be a mistake on their part. In any case, it is far from a critical piece of information for this paper and we have thus decided to remove this sentence.

      Fig 1D the reference to the A,C,G,U 5' should be re-positioned within Figure 1D panel space.

      Thanks, this has been addressed.

      Figure 3: it could be a supplementary figure based on the relevance given in the manuscript to this point.

      We agree, and have moved Fig3 to Supplement.

      P5, line 107: while commenting on strand bias there seems to be a mistake in strong bias definition, it should be x 0.8, not "strong bias (0.2

      Thank you for pointing this out. We have now corrected this error in the text.

      P5, line 110: marked changes regarding locus size are not as striking in my opinion, in particular log size 6 and following, which is not marked in the graph (the cut off between 6 and 8). Maybe this curve should be split into two distribution graphs based on some important features (as repetitiveness?) that might allow a better definition of cut-offs.

      Thank you for pointing this out. You are correct that the changes in the density distribution are not as striking for locus size. A great deal of deliberation on our part went into deciding what to do about this. In the end, we decided that for the size classes there was benefit in having several different classes, with the understanding that having additional, potentially redundant cut-offs would not adversely affect the analysis. In doing this, we were partially driven by the admittedly subtle changes in the curve, but also by the desire to have size classes that were biologically relevant and informative. For example, a locus 3000nt captures the long tail. However, we neglected to fully explain these subtleties in our decision-making, something we have now rectified through some added explanation in the text. These choices were validated by the way size classes are differentially associated with different locus clusters in Figure 8.

      Fig 5: the legend has the C subfigure twice, the second should be D.

      Thank you for highlighting this. It has now been corrected.

      Table 1: I believe the data would be better presented in a plot, potentially something similar to the plot in Figure 1 A and B. The numbers are already presented in the supplementary spreadsheet.

      Thanks for pointing this out. We agree with this suggestion and have replaced Table 1 with a Figure (Fig 5) which is indeed a better way to present those results.

      Fig 6A: The boxplots regarding Stability of the clusters should be better described. What exactly does the y-axis in each "small plot" represent?

      Thank you for pointing this out, we understand that this isn't clear at the moment. Briefly, for this analysis we performed the clustering multiple times each time with a random sample of the loci (with replacement) of the same size as the original dataset. We then calculated the proportion of loci that retained their original clustering. We have clarified this in the figure legend and also elaborated on the approach in the methods section to ensure that it is better described.
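
      The resampling procedure described in this reply can be sketched as follows. The clustering step here is a toy one-dimensional two-means routine standing in for the authors' actual clustering, and all names and data are hypothetical. Labels are tied to centroid order so that they remain comparable between runs.

```python
import random


def two_means_1d(values, n_iter=20):
    """Toy 1-D k-means with k=2; label 0 is always the lower-centred cluster."""
    low_centre, high_centre = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [0 if abs(v - low_centre) <= abs(v - high_centre) else 1
                  for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            low_centre = sum(group0) / len(group0)
        if group1:
            high_centre = sum(group1) / len(group1)
    return labels


def bootstrap_stability(values, n_boot=200, seed=0):
    """Fraction of resampled items keeping their original label, per bootstrap."""
    rng = random.Random(seed)
    original = two_means_1d(values)
    n = len(values)
    fractions = []
    for _ in range(n_boot):
        # Sample with replacement, same size as the original dataset.
        idx = [rng.randrange(n) for _ in range(n)]
        boot_labels = two_means_1d([values[i] for i in idx])
        retained = sum(b == original[i] for b, i in zip(boot_labels, idx))
        fractions.append(retained / n)
    return fractions


# Hypothetical feature values forming two well-separated groups.
data = [1.0 + 0.05 * i for i in range(10)] + [5.0 + 0.05 * i for i in range(10)]
stability = bootstrap_stability(data)
print(sum(stability) / len(stability))
```

      The list returned by `bootstrap_stability` corresponds to the values summarised by one boxplot; repeating it for each candidate k would give one boxplot per k.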

      P6, line 142: analyses of stability and variance shows 7 as the optimal k, while gap statistics and NMI suggested 6 as the optimal. It is not clear why 6 was preferred. The MCA section in Methods is unclear regarding this point too.

      Thank you for querying this. The process of choosing the appropriate value of k is a complicated one and we appreciate that the explanation could be clearer. After your comment, we re-visited our decision-making process and were reassured that a k value of 6 rather than 7 was indeed appropriate. The stability plots in Fig. 6A start with k=2 and it can be clearly seen for k=6 that stability is comparatively high for dimensions 7-10. Indeed, k values of 2,3 and 6 seem to be the only feasible values. k=7 is fairly unstable for all dimensions from 1-8. We have done some rewording of the methods to hopefully make this clearer.

      Fig S2-S5: please check legends, they are identical, although they should cover examples of loci in LC2 through LC5. These figures are not cited in the text, only S1 and S2.

      Thanks for pointing this out. This is now corrected and we have referenced all figures in the main text.

      Fig 9: I suggest using different colors in density plots to ease interpretation. LC tracks could share a color and Gene, TEs, DNA meth, and All loci should have a different color each.

      A good suggestion - this has been replotted with different colours.

      Supplementary Files S1: The full-annotated locus map should be provided as a spreadsheet file or as a text (.csv) file, not as a pdf file.

      Thanks for pointing this out. We originally submitted this file in gff format and are not sure why it was converted. We will make sure it is provided in an appropriate format in the final version, especially having suffered from the pains of pdf tables ourselves in the past.

      I may be misunderstanding Fig. 6E, but it looks strange that the observed sum-of-squares is smooth, but the expected is not. Is it possible that the in-figure reference is inverted?

      Indeed, the colours were inverted. Thanks a lot for that spot, we have now swapped them around.

      Reviewer #2

      I am concerned that the methodology used does not adequately distinguish small RNA loci that are attributable to random RNA degradation products from loci that truly fit the DCL / AGO paradigm. I think this is critical to maximize the utility of the annotations for the community. This issue was not directly addressed in the current version of the manuscript. There is cause for concern: 64% of the annotations overlap with protein-coding genes (lines 116-117), 55% with exons (line 118), and 41% of loci show strong strand bias (lines 123-124). These are all associations expected for breakdown products of mRNAs. Furthermore, only 11% of the loci were found to be dependent on CrDCL3 (line 123). Small RNA sequencing data from the other 2 DCL mutants are not yet available (line 211). One way that has been effective in angiosperms is to track the proportion of "DCL-sized" RNAs within all RNAs from each locus. Loci comprised of random degradation products will be single-stranded, generally touching exons, and have a very wide size distribution. In contrast, loci where the small RNAs are truly created by a DCL protein will have a very narrow size distribution. In any event, I think a strong effort to identify and flag small RNA loci that are less likely to be DCL / AGO silencing RNAs, and more likely to be degradation products, would be an important change to this study.
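
      The per-locus proportion of "DCL-sized" RNAs suggested here could be computed along the following lines. The size window (20-21 nt) and the 0.5 threshold are placeholders chosen for illustration, not values taken from the manuscript or the review, and the function names are hypothetical.

```python
from collections import Counter


def dcl_size_fraction(read_lengths, dcl_sizes=(20, 21)):
    """Fraction of a locus's reads whose length falls in the assumed DCL size window."""
    if not read_lengths:
        return float("nan")
    counts = Counter(read_lengths)
    in_window = sum(counts[size] for size in dcl_sizes)
    return in_window / len(read_lengths)


def flag_locus(read_lengths, min_fraction=0.5, dcl_sizes=(20, 21)):
    """Label a locus by whether its size distribution is narrow (threshold is hypothetical)."""
    fraction = dcl_size_fraction(read_lengths, dcl_sizes)
    return "DCL-like" if fraction >= min_fraction else "possible degradation"


# Hypothetical loci: one with a narrow size distribution, one with a broad one.
narrow_locus = [21] * 80 + [20] * 15 + [22] * 5
broad_locus = list(range(15, 35)) * 5

print(dcl_size_fraction(narrow_locus), flag_locus(narrow_locus))
print(dcl_size_fraction(broad_locus), flag_locus(broad_locus))
```

      A flag of this kind could be combined with strand bias and exon overlap before a locus is labelled as a likely degradation product.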

      Thank you for this very insightful comment which has helped us to reflect on the methodological approach. While it is likely that there are some RNA breakdown products picked-up in the sRNA sequencing, we do not think that the locus-map as a whole is undermined by this. For example 54% of loci have a predominance for 21-nt sRNAs and 18% for 20-nt sRNAs, so the majority of sRNA loci do have a predominance for a specific RNA size.

      However, your point does raise a very valid concern with implications for the interpretation of LC4. Although we posit some explanations for these loci (e.g. DCL-mediated sRNA production without an accessory protein to provide PAZ domain-like sRNA measurement), given the very strong strand bias and association with genic regions we do agree that there is a risk that these loci predominantly represent degradation fragments. Therefore, we have now reworded how we discuss LC4 in the discussion to reflect this. This also reveals a key advantage of the clustering approach: should LC4 indeed represent degradation products, they have been successfully grouped together into a separate cluster such that they don't undermine the insights gained from the other locus clusters.

      One of the key results likely to be used by others is the final GFF3 file (Sup File S1). The Description fields in this file are extremely verbose. Do these load well on a genome browser? I suggest it might be good to store most of the information currently in the Description field in a separate flat file, and limit the GFF3 descriptions to key information (locus name, the LC group).

      Thank you for pointing this out. In our pursuit of sharing as many details as possible, we appreciate that the file can be too verbose, as rightfully noticed here. In order not to compromise detail too much, we have created a second, toned-down version as a csv file which includes essential details such as name, position and LC. As for the gff, we kept all details in, since it loads quickly in a genome browser and also in other tools such as R, in which those features can be used as efficient filters.
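
      For readers who want to derive their own reduced table from a verbose GFF3 file, a minimal Python sketch follows. It relies only on the standard GFF3 layout (nine tab-separated columns, with `key=value` pairs separated by `;` in the last column). The attribute keys `Name` and `LC`, the example record and the function names are assumptions made for illustration and may not match the keys used in the actual supplementary file.

```python
import csv
import io


def slim_gff3(gff3_text, keep=("Name", "LC")):
    """Extract position, strand and a few selected attributes per feature."""
    rows = []
    for line in gff3_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        seqid, _source, _type, start, end, _score, strand, _phase, attrs = cols
        # Column 9: semicolon-separated key=value pairs.
        attr = dict(item.split("=", 1) for item in attrs.split(";") if "=" in item)
        rows.append([seqid, start, end, strand] + [attr.get(k, "") for k in keep])
    return rows


def to_csv(rows, keep=("Name", "LC")):
    """Serialise the reduced rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["seqid", "start", "end", "strand", *keep])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical record, used only to exercise the functions.
example = (
    "##gff-version 3\n"
    "chromosome_1\tsketch\tsRNA_locus\t100\t250\t.\t+\t.\t"
    "Name=locus_A;LC=LC3;Note=a very long description\n"
)
slim_rows = slim_gff3(example)
slim_csv = to_csv(slim_rows)
```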

      Sup Table S1 would be much more useful for future researchers if it had a column with the direct accession numbers for the raw sequencing libraries.

      We have included another table, Sup Table S6 ("Supp_Table_S6_library_ENA_accession"), which provides direct ENA accession numbers as well as numerous other metadata.

      Figures showing genome browser snapshots are too small; the text is mostly illegible on screen and when printed. This includes Figure 4 and Figures S1-S5.

      The snapshots have been improved to ensure better readability.

      Lines 67-68: This is unclear to me. Did the authors do Northerns? Please clarify / re-write.

      Thank you for querying this. After much closer inspection of the papers cited by Casas-Mollano et al. as evidence of the 23nt peak, the evidence for this peak doesn't seem that strong and may even be a mistake on their part. Nonetheless, it is far from a critical piece of information for this paper and we have thus decided to remove this sentence.

      Figure 2B: X-axis label, perhaps change to "number of reads in library" for clarity.

      We agree and have changed it accordingly.

      Figure 4 caption: The acronym "CRSL" should be defined.

      CRSL has now been duly defined in the manuscript.

      Line 387: Reference #29 (line 509): There is not enough information here to find the data.

      We have used the appropriate bibtex code to reference this Zenodo share (https://zenodo.org/record/3862405/export/hx). The current citation format somehow omits some information. We hope this will be fixed by the publisher, but we have also provided the full DOI address in the “additional information” section just in case. We will keep an eye on how it comes out.

      Style suggestion on title: What is "secret" about the genome? I didn't really understand that first part of the title. Perhaps consider revision to make it more factual and less literary. Just "A small RNA locus map for Chlamydomonas reinhardtii"?

      Thank you for this suggestion, we have adapted the title to make it more descriptive.

      Reviewer #3

      …the evolutionary implications are not clear. The authors state in the abstract that "These results are consistent with the idea that there was diversification in sRNA mechanisms after the evolutionary divergence of algae from higher plant lineages." Although in the end this may prove to be correct, the only species compared are Arabidopsis thaliana (as representative of land plants) and Chlamydomonas reinhardtii (as representative of green algae). With this very limited information it is not possible to infer the sRNA loci (much less sRNA mechanisms) in an ancestral species. It remains formally possible that an ancestral progenitor species had a greater diversity of sRNA loci that were subsequently lost in a selective manner in specific lineages. Moreover, the diversity of sRNA loci may not correlate strictly with the diversity of the RNAi machinery since, at least some loci, do not appear to be associated with RNAi components such as Dicer or Argonaute.

      Thank you for these insightful comments. As we followed a very similar methodological approach to that used to produce the Arabidopsis sRNA locus map published in Hardcastle et al. (2018), we wanted to take the opportunity to compare the results and build upon the ongoing discussion concerning the evolution of sRNA mechanisms in Chlamydomonas (e.g. Valli et al. 2016). Your point about the possibility of an ancestral progenitor with greater diversity that was then lost is very valid. You are also of course correct about the limitations to what can be concluded from this study and the limited comparisons that can be made. We see our approach as a useful tool for hypothesis generation which can be complemented by more in-depth exploration in the future. With this in mind, and taking on board your comments, we have elaborated on our discussion of the evolutionary implications of our study, which we hope now gives a more balanced account.

      I may have missed it but I could not find a table listing the specific sRNA loci assigned to each of the locus classes. It would be very useful to provide the class annotation of each sRNA locus in order to facilitate future analyses of sRNA biogenesis and function.

      That information was indeed missing, thanks for bringing it up. We have now included this in the gff file (column LC) as well as in another cleaner table (Supp_Table_S7_loci_class_annotation).

      Figures S2 to S5 have the same legend but they correspond to different loci. It would be useful to provide for each locus class, as supplementary figures, two examples of typical sRNA loci.

      Thanks for pointing this out. This was an error on our part, and the captions have now been corrected. Unfortunately, due to the ongoing pandemic-related restrictions, we have so far been unable to run a genome browser session to create more locus figures.

      If information is available, the paper would be strengthened by some locus class validation based on features not used to generate the classification.

      Thank you for this suggestion. In fact, not all annotation features were used predictively in the MCA and clustering process, and so these "supplementary" annotations as outlined in supplementary table S3 can provide some cross-validation. With that in mind, we have now included an additional heatmap as a supplementary figure which shows associations for some of these supplementary annotations as well as corresponding explanations in the text. Further validation is provided by the chromosome tracks in figure 9 showing the distinct genomic distributions of each locus cluster despite chromosomal location not being a factor in the clustering.

      Pg 5, line 108. I think you mean "strong bias (0.2 > x > 0.8)."

      Thank you for pointing this out, we have now corrected this error.
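
      To make the corrected definition concrete, the strong-bias criterion can be written as a two-sided test, as in the Python sketch below. The sketch assumes that x is the fraction of a locus's reads mapping to one strand (taken here as the plus strand); that interpretation and the function name are assumptions for illustration, and only the 0.2 and 0.8 cut-offs come from the comments above.

```python
def is_strong_strand_bias(plus_reads, minus_reads, low=0.2, high=0.8):
    """Return True when the plus-strand fraction x satisfies x < low or x > high."""
    total = plus_reads + minus_reads
    if total == 0:
        raise ValueError("locus has no reads")
    x = plus_reads / total  # assumed definition of the bias statistic
    return x < low or x > high


mostly_plus = is_strong_strand_bias(9, 1)   # x = 0.9
mostly_minus = is_strong_strand_bias(1, 9)  # x = 0.1
balanced = is_strong_strand_bias(5, 5)      # x = 0.5
```

      Written this way, it is clear that the condition describes the two tails of the distribution, whereas an expression such as 0.2 < x < 0.8 describes the unbiased middle.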

      Pg 7, Table 1. Some of the annotation features are obvious but some abbreviations may need clarification using footnotes.

      Table 1 has been replaced by the new Fig 5, annotation/abbreviations should now be more obvious.

      Pg 8, lines 156-157. This sentence is not clear. Additionally, the legends to Figures S2-S5 do not refer to LC2 paragon (CSRL003890).

      Thank you for pointing this out. We have now moved the reference to the paragons to earlier in the section where we introduce the six clusters. We hope this is now clearer.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript presents a detailed map of sRNA (precursor) loci in the green alga Chlamydomonas reinhardtii based on large volumes of sequencing data (145 sRNA libraries). The locus map based on a false discovery rate of less than 0.05 had 6164 loci, covering 4.1% of the Chlamydomonas reference genome. Individual loci were annotated based on both intrinsic features, such as sRNA size, 5'-nucleotide, strand bias and phasing pattern, and extrinsic features, such as sRNA expression, genotype and overlap with genomic attributes (e.g., genes, transposons, methylation levels).

      By using the intrinsic and extrinsic features of each sRNA locus and Multiple Correspondence Analysis (MCA) approaches, the sRNA loci were clustered into six distinct classes, referred to as locus class (LC) 1-6. This strategy is partly validated by the grouping of well-characterized Chlamydomonas miRNAs into the same cluster, LC3.

      As the authors state, this data-driven approach is valuable for hypothesis generation since (with the possible exception of LC3) the biogenesis and function of most sRNA loci (and of the corresponding locus classes) remain uncharacterized in Chlamydomonas. The analysis provides a framework to facilitate future characterization of the diverse types of sRNAs in this model algal system.

      However, the evolutionary implications are not clear. The authors state in the abstract that "These results are consistent with the idea that there was diversification in sRNA mechanisms after the evolutionary divergence of algae from higher plant lineages." Although in the end this may prove to be correct, the only species compared are Arabidopsis thaliana (as representative of land plants) and Chlamydomonas reinhardtii (as representative of green algae). With this very limited information it is not possible to infer the sRNA loci (much less sRNA mechanisms) in an ancestral species. It remains formally possible that an ancestral progenitor species had a greater diversity of sRNA loci that were subsequently lost in a selective manner in specific lineages. Moreover, the diversity of sRNA loci may not correlate strictly with the diversity of the RNAi machinery since, at least some loci, do not appear to be associated with RNAi components such as Dicer or Argonaute.

      Some specific comments:

      1.I may have missed it but I could not find a table listing the specific sRNA loci assigned to each of the locus classes. It would be very useful to provide the class annotation of each sRNA locus in order to facilitate future analyses of sRNA biogenesis and function.

      2.Figures S2 to S5 have the same legend but they correspond to different loci. It would be useful to provide for each locus class, as supplementary figures, two examples of typical sRNA loci.

      3.If information is available, the paper would be strengthened by some locus class validation based on features not used to generate the classification.

      4.Pg 5, line 108. I think you mean "strong bias (0.2 > x > 0.8)."

      5.Pg 7, Table 1. Some of the annotation features are obvious but some abbreviations may need clarification using footnotes.

      6.Pg 8, lines 156-157. This sentence is not clear. Additionally, the legends to Figures S2-S5 do not refer to LC2 paragon (CSRL003890).

      Significance

      Chlamydomonas reinhardtii is a model unicellular green alga, the lineage of which diverged from land plants approximately one billion years ago. Chlamydomonas encodes a great number of diverse small RNAs. However, the biogenesis and function of the majority of these sRNAs are not known. By grouping sRNA loci into specific classes (based on intrinsic and extrinsic features), this manuscript provides a framework that will facilitate the future characterization of sRNAs in Chlamydomonas and, very likely, in other algal species. This information may also contribute to our understanding of the evolution of sRNA loci within eukaryotes.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript describes the annotation of small RNA-producing loci from the green alga Chlamydomonas reinhardtii. A large number of small RNA-sequencing datasets were analyzed and used to create genome-wide annotations of small RNA-producing loci. These loci were annotated based on several features, and then classified into six major groups based on these features.

      Major comments:

      Are the key conclusions convincing? --> Yes.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? --> No

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary to evaluate the paper as it is, and do not ask authors to open new lines of experimentation. --> Yes, additional analyses should be conducted, see itemized list below.

      Are the suggested experiments realistic for the authors? It would help if you could add an estimated cost and time investment for substantial experiments. --> Perhaps a few weeks to a month of analysis and revision time.

      Are the data and the methods presented in such a way that they can be reproduced? --> Yes.

      Are the experiments adequately replicated and statistical analysis adequate? --> Yes.

      SPECIFIC COMMENTS:

      1.I am concerned that the methodology used does not adequately distinguish small RNA loci that are attributable to random RNA degradation products from loci that truly fit the DCL / AGO paradigm. I think this is critical to maximize the utility of the annotations for the community. This issue was not directly addressed in the current version of the manuscript. There is cause for concern: 64% of the annotations overlap with protein-coding genes (lines 116-117), 55% with exons (line 118), and 41% of loci show strong strand bias (lines 123-124). These are all associations expected for breakdown products of mRNAs. Furthermore, only 11% of the loci were found to be dependent on CrDCL3 (line 123). Small RNA sequencing data from the other 2 DCL mutants are not yet available (line 211). One way that has been effective in angiosperms is to track the proportion of "DCL-sized" RNAs within all RNAs from each locus. Loci comprised of random degradation products will be single-stranded, generally touching exons, and have a very wide size distribution. In contrast, loci where the small RNAs are truly created by a DCL protein will have a very narrow size distribution. In any event, I think a strong effort to identify and flag small RNA loci that are less likely to be DCL / AGO silencing RNAs, and more likely to be degradation products, would be an important change to this study.

      MINOR COMMENTS:

      2.One of the key results likely to be used by others is the final GFF3 file (Sup File S1). The Description fields in this file are extremely verbose. Do these load well on a genome browser? I suggest it might be good to store most of the information currently in the Description field in a separate flat file, and limit the GFF3 descriptions to key information (locus name, the LC group).

      3.Sup Table S1 would be much more useful for future researchers if it had a column with the direct accession numbers for the raw sequencing libraries.

      4.Figures showing genome browser snapshots are too small; the text is mostly illegible on screen and when printed. This includes Figure 4 and Figures S1-S5.

      5.Lines 67-68: This is unclear to me. Did the authors do Northerns? Please clarify / re-write.

      6.Figure 2B: X-axis label, perhaps change to "number of reads in library" for clarity.

      7.Figure 4 caption: The acronym "CRSL" should be defined.

      8.Line 387: Reference #29 (line 509): There is not enough information here to find the data.

      9.Style suggestion on title: What is "secret" about the genome? I didn't really understand that first part of the title. Perhaps consider revision to make it more factual and less literary. Just "A small RNA locus map for Chlamydomonas reinhardtii"?

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.:

      This study provides a genome-wide annotation of small RNA-producing loci from Chlamydomonas reinhardtii. This will serve as a useful data resource for researchers working with this model system. The results overall confirm what is known from previous studies of Chlamy small RNAs: they are rather distinct from angiosperm small RNAs and from animal small RNAs.

      Place the work in the context of the existing literature (provide references, where appropriate).:

      This may be the first study to provide a genome-wide annotation (as opposed to a focused effort) for Chlamy small RNA populations.

      State what audience might be interested in and influenced by the reported findings:

      Chlamy researchers, especially those interested in gene silencing and genome annotations, and small RNA specialists with interest in annotations and in wide phylogenetic comparisons.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. :

      Plant microRNAs, siRNAs, genetics, and genomics.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this study, Müller, Matthews, Valli, and Baulcombe have used data-driven machine learning approaches to annotate and classify sRNA loci of Chlamydomonas reinhardtii. I have found the manuscript very interesting and a handy handbook for the appropriate way to annotate sRNA loci in different organisms. I believe this is not only a great resource paper on its own, but it also contains essential information to start understanding how Chlamydomonas silences TEs without an RdDM pathway. I have a few comments that may help to improve the manuscript.

      • L50-52: Can you predict where the unmapped reads came from? Could viral infections be the source, as in land plants?
      • L67-68: which is the explanation?
      • Fig 1D: the reference to the A,C,G,U 5' should be re-positioned within the Figure 1D panel space.
      • Figure 3: it could be a supplementary figure based on the relevance given in the manuscript to this point.
      • P5, line 107: while commenting on strand bias there seems to be a mistake in the strong bias definition; it should be x < 0.2 and x > 0.8, not "strong bias (0.2 < x < 0.8)", as in the text.
      • P5, line 110: marked changes regarding locus size are not as striking in my opinion, in particular log size 6 and following, which is not marked in the graph (the cut-off between 6 and 8). Maybe this curve should be split into two distribution graphs based on some important features (such as repetitiveness?) that might allow a better definition of cut-offs.
      • Fig 5: the legend has the C subfigure twice; the second should be D.
      • Table 1: I believe the data would be better presented in a plot, potentially something similar to the plot in Figure 1 A and B. The numbers are already presented in the supplementary spreadsheet.
      • Fig 6A: The boxplots regarding stability of the clusters should be better described. What exactly does the y-axis in each "small plot" represent?
      • P6, line 142: analyses of stability and variance show 7 as the optimal k, while gap statistics and NMI suggested 6 as the optimal. It is not clear why 6 was preferred. The MCA section in Methods is unclear regarding this point too.
      • Fig S2-S5: please check legends; they are identical, although they should cover examples of loci in LC2 through LC5. These figures are not cited in the text, only S1 and S2.
      • Fig 9: I suggest using different colors in density plots to ease interpretation. LC tracks could share a color, and Gene, TEs, DNA meth, and All loci should have a different color each.
      • Supplementary Files S1: The full-annotated locus map should be provided as a spreadsheet file or as a text (.csv) file, not as a pdf file.
      • Fig 6E: I may be misunderstanding Fig. 6E, but it looks strange that the observed sum-of-squares is smooth, but the expected is not. Is it possible that the in-figure reference is inverted?

      Significance

      This is a very interesting article. It may look a little technical, but it provides useful information for people studying Chlamydomonas. In addition, the way the authors approached the annotation of sRNA is very meticulous and elegant. I would suggest that people exploring small RNAs in non-model organisms use this article as a handbook of how to annotate sRNAs. In this particular way the article will be of interest beyond the Chlamydomonas, and even plant, research field.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript is clearly written and the figures appropriate and informative. Some descriptions of data analyses are a little dense but reflect what would appear to be long, hard efforts on the part of the authors to identify and control for possible sources of misinterpretation due to sensitivities of parameters in their fitness model. The authors' efforts to retest interactions under non-competition conditions allay most of the concerns that I would have. One problem, though, that I could not see explicitly addressed was that of potential effects of interactions between methotrexate and the other conditions and how this is controlled for. Specifically, it could be argued that the fact that a particular PPI is observed under a specific condition could have more to do with a synthetic effect of treatment of cells with a drug plus methotrexate. Is this controlled for and how? I raise this because in a chemical genetic screen for fitness it was shown that methotrexate is particularly promiscuous for drug-drug interactions (Hillenmeyer ME, et al. Science 2008). I tried to think of how this works but couldn't come up with anything immediately. I'd appreciate it if the authors would take a crack at resolving this issue. Otherwise I have no further concerns about the manuscript.

      We thank the reviewer for the kind comments. We agree with the reviewer’s point that methotrexate could be interacting with drugs or other perturbagens, similar to how the chosen nitrogen source, carbon source, or other growth conditions may interact with a drug. However, the methotrexate concentration is held constant across all conditions, as are the rest of the media components such as the nitrogen and carbon source (with the exception of the raffinose perturbation). Any interactions with methotrexate, or other media components, are undetectable without systematically varying all components for all stressors. Therefore, we use the typical experimental design of measuring molecular variation from a reference, holding invariant media components (such as methotrexate, glucose, or vitamins) fixed between conditions. This is a general practice, and we describe that every condition contains methotrexate on page 3, line 10.

      The library was grown under mild methotrexate selection in 9 environments for 12-18 generations in serial batch culture, diluting 1:8 every ~3 generations, with a bottleneck population size greater than 2 x 10^9 cells (Table S1).

      We also list the full details of each environment in Table S1.
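
      As a quick consistency check on the culture regime quoted above, the relation between dilution factor and generations per growth cycle can be worked through in a few lines of Python. The sketch assumes that the culture regrows to its pre-dilution density in each cycle, so a 1:8 dilution corresponds to log2(8) = 3 doublings; the function names are illustrative only.

```python
from math import log2


def generations_per_cycle(dilution_factor):
    """Doublings needed to regrow to the pre-dilution density."""
    return log2(dilution_factor)


def cycles_for(total_generations, dilution_factor):
    """Number of dilution cycles that span the given number of generations."""
    return total_generations / generations_per_cycle(dilution_factor)


per_cycle = generations_per_cycle(8)   # doublings per 1:8 dilution cycle
min_cycles = cycles_for(12, 8)         # cycles for 12 generations
max_cycles = cycles_for(18, 8)         # cycles for 18 generations
```

      Under this assumption, 12-18 generations correspond to between four and six 1:8 dilution cycles.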

      Reviewer #1 (Significance (Required)):

      Liu et al. expand on previous work from the Levy group to explore a massive in vivo protein interactome in the yeast S. cerevisiae. They achieve this by performing screens across 9 growth conditions, which, with replication, results in a total of 44 million measurements. Interpreting their results based on a fitness model for pooled growth under methotrexate selection, they make the key observation that there is a vastly expanded pool of protein-protein interactions (PPI) that are found under only one or two conditions compared to a more limited set of PPI that are found under a broad set of conditions (mutable versus immutable interactors). The authors show that this dichotomy suggests some important features of proteins and their PPIs that raise important questions about functionality and evolution of PPIs. Among these are that mutable PPIs are enriched for cross-compartmental, high disorder and higher rates of evolution and subcellular localization of proteins to chromatin, suggesting roles in gene regulation that are associated with cellular responses to new conditions. At the same time these interactions are not enriched for changes in abundance. These results are in contrast to those of immutable PPIs, which seem to form a core background noise, more determined by changes in abundance than what the authors interpret must be post-translational processes that may drive, for instance, changes in subcellular localization resulting in appearance of PPIs under specific conditions. The authors are also able to address a couple of key issues about protein interactomes, including the controversial Party-date Hub hypothesis of Vidal, in which they could now affirm support for this hypothesis based on their results and notably negative correlation of PPIs to protein abundance for mutable PPIs. Finally, they also addressed the problem of predicting the upper limit of PPIs in yeast, showing the remarkable result that it may be no more than about 2 times the number of proteins expressed by yeast. Such an upper limit is profoundly important to modelling cellular network complexity and, if it holds up, could define a general upper limit on organismal complexity.

      This manuscript is a very important contribution to understanding dynamics of molecular networks in living cells and should be published with high priority.

      Reviewer 2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Report on Liu et al. "A large accessory protein interactome is rewired across environments"

      Liu et al. use a mDHFR-based, pooled barcode sequencing / competitive growth / mild methotrexate selection method to investigate changes of PPI abundance of 1.6 million protein pairs across 9 different growth conditions. Because most PPI screens aim to identify novel PPIs in standard growth conditions, the currently known yeast PPI network may be incomplete. The key concept is to define "immutable" PPIs that are found in all conditions and "mutable" PPIs that are present in only some conditions.

      The assay identified 13764 PPIs across the 9 conditions, using optimized fitness cut-offs. Steady PPIs, i.e. those found across all environments, were identified in membrane compartments and cell division. Processes associated with the chromosome, transcription, protein translation, RNA processing and ribosome regulation were found to change between conditions. Mutable PPIs form modules, as topological analyses reveal.

      Interestingly, a correlation between intrinsic disorder and PPI mutability was found; mutable PPIs are postulated to be more flexible in the conformational context, while at the same time they are formed by less abundant proteins.

      I appreciate the trick to use homodimerization as an abundance proxy to predict interaction between heterodimers (of proteins that homodimerize). This "mass-action kinetics model" explains the strength of 230 out of 1212 tested heterodimers.

      A validation experiment of the glucose transporter network was performed and 90 "randomly chosen" PPIs that were present in the SD environment were tested in NaCl (osmotic stress) and Raffinose (low glucose) conditions through recording optical density growth trajectories. Hxt5 PPIs stayed similar in the tested conditions, supported by the current knowledge that Hxt5 is highly expressed in stationary phase and under salt stress. In Raffinose, Hxt7, previously reported to increase the mRNA expression, lost most PPIs indicating that other factors might influence Hxt7 PPIs.

      **Points for consideration:**

      *) A clear definition of mutable and immutable is missing, or could not be found e.g. at page 4 second paragraph.

      We thank the reviewer for pointing this out. We have now added a better definition of mutable and immutable on page 4, line 19:

      We partitioned PPIs by the number of environments in which they were identified and defined PPIs at opposite ends of this spectrum as “mutable” PPIs (identified in only 1-3 environments) and “immutable” (identified in 8-9 environments).
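
      The partition quoted above can be expressed as a small Python function. The cut-offs (1-3 environments for "mutable", 8-9 for "immutable", out of 9 environments) come from the quoted definition; the function name and the "intermediate" label for PPIs detected in 4-7 environments are assumptions made for this sketch.

```python
def mutability_class(n_environments, total_environments=9):
    """Classify a PPI by the number of environments in which it was identified."""
    if not 1 <= n_environments <= total_environments:
        raise ValueError("n_environments must be between 1 and total_environments")
    if n_environments <= 3:
        return "mutable"
    if n_environments >= 8:
        return "immutable"
    return "intermediate"  # label assumed for the middle of the spectrum


classes = [mutability_class(n) for n in range(1, 10)]
```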

      *) Approximately half of the PPIs have been identified in one environment. Many of those mutable PPIs were detected in the 16 °C condition. Is there an explanation for the predominance of this specific environment? What are these PPIs about?

      The reviewer is correct that ~40% of the PPIs identified in only one environment were found in the 16 ℃ environment. One reason for this could be technical: the positive predictive value (PPV) is the lowest amongst the conditions (16 ℃: 31.6%, mean: 57%, Table SM6). It must be noted, however, that PPVs are calculated using reference data that has generally been collected in standard growth conditions. So, it might be expected that the most divergent environment from standard growth conditions (resulting in the most differences in PPIs) would result in a lower PPV in our study even if the true frequency of false positives was equivalent across environments. We have attempted to be transparent about the quality of the data in each environment by reporting PPVs and other metrics in Table SM6. However, we suspect that the large number of PPIs unique to 16 ℃ is due in part to the fact that it causes the largest changes in the protein interactome, and believe that it should be included, even at the risk of lowering the overall quality of the data. The main reason for this is that this data is likely to contain valuable information about how the cell copes with this stress. For example, we find, but do not highlight in the manuscript, that 16 ℃-specific PPIs contain two major hubs (DID4: 285 PPIs involved in endocytosis and vacuolar trafficking, and DED1: 102 PPIs involved in translation), both of which are reported to be associated with cold adaptation in yeast (Hilliker et al., 2011; Isasa et al., 2015).

      To assess whether the potentially higher false-positive rate in 16 ℃ could be impacting our conclusions related to PPI network organization and features of immutable and mutable PPIs, we repeated these analyses leaving out the 16 ℃ data and found that our main conclusions did not change. This new analysis is now presented in Figure S8 and described on page 5, line 10.

      Finally, we used a pair of more conservative PPI calling procedures that either identified PPIs with a low rate of false positives across all environments (FPR

      We have also added references to other panels in Figure S8 throughout the manuscript, where appropriate.

      *) A 50% overall retest validation rate is fair and reflects a value comparable to other large-scale approaches. However, what is the actual variation, e.g. between mutable and immutable PPIs, or between conditions, e.g. at 16 °C?

      We validated 502 PPIs present in the SD environment and an additional 36 PPIs in the NaCl environment. As the reviewer suggests, we do indeed observe differences in the validation rate across mutability bins. This data is reported in Figures 3B and S6B, and we use this information to provide a confidence score for each PPI on page 5, line 4.

      To better estimate how the number of PPIs changes with PPI mutability, we used these optical density assays to model the validation rate as a function of the mean PPiSeq fitness and the number of environments in which a PPI is detected. This accurate model (Spearman's r =0.98 between predicted and observed, see Methods) provided confidence scores (predicted validation rates) for each PPI (Table S5) and allowed us to adjust the true positive PPI estimate in each mutability bin. Using this more conservative estimate, we still found a preponderance of mutable PPIs (Figure S6E).

      The validation rate in NaCl is similar to SD (39%, 14/36), suggesting that validation rates do not vary excessively across environments. Because validation experiments are time consuming (we performed 6 growth experiments per PPI), performing a similar scale of validations in all environments as in SD would be resource intensive. Instead, we report a number of metrics (true positive rate, false positive rate, positive predictive value) in Table SM6 using large positive and random reference sets. We believe these metrics are sufficient for readers to compare the quality of data across environments.

      *) What is the R correlation cutoff for PPIs explained in the mass equilibrium model vs. not explained?

      We do not use an R correlation cutoff to assess if a PPI is explained by the mass-action equilibrium model. We instead rely on ordinary least-squares regression as detailed in the methods on page 68, line 13.

      ...we used ordinary least-squares linear regression in R to fit a model of the geometric mean of the homodimer signals multiplied by a free constant and plus a free intercept. Significantly explained heterodimer PPIs were judged by a significant coefficient (FDR 0.05, single-test). This criteria was used to identify PPIs for which protein expression does or does not appear to play as significant of a role as other post-translational mechanisms.

      The first criterion identifies a quantitative fit to the model of variation being related. The second criterion is used to filter out PPIs for which the relationship appears to be explained by more than just the homodimer signals. This approach is more stringent, but we believe this is the most appropriate statistical test to assess fit to this linear model.
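
      The regression described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study: the function name `fit_heterodimer` and the example signal values are hypothetical, and the significance test on the fitted coefficient (the FDR step) is not reproduced here.

```python
import math

def fit_heterodimer(homo_a, homo_b, hetero):
    """Ordinary least-squares fit of
    hetero ~ slope * sqrt(homo_a * homo_b) + intercept.

    Each argument is a sequence of signals measured across environments.
    """
    # Predictor: geometric mean of the two homodimer signals
    x = [math.sqrt(a * b) for a, b in zip(homo_a, homo_b)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(hetero) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, hetero))
    slope = sxy / sxx                      # the "free constant"
    intercept = mean_y - slope * mean_x    # the "free intercept"
    return slope, intercept

# Hypothetical signals for one heterodimer across four environments
homo_a = [1.0, 4.0, 9.0, 16.0]
homo_b = [1.0, 4.0, 9.0, 16.0]
hetero = [2.5, 8.5, 18.5, 32.5]
slope, intercept = fit_heterodimer(homo_a, homo_b, hetero)
```

      In a full analysis, the p-value of the slope would then be corrected for multiple testing across all heterodimers before a PPI is called "explained".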

      *) 90 "randomly chosen" PPIs for validation. It needs to be demonstrated that these interactions are a random subset; otherwise it could also mean cherry-picked interactions.

      We selected 90 of the 284 glucose transport-related PPIs for validation using the “sample” function in R (replace = FALSE). We have now included text that describes this on page 63, line 3 in the supplementary methods:

      Diploids (PPIs) on each plate were randomly picked using the “sample” function in R (replace = FALSE) from PPIs that meet specific requirements.
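
      For readers who want to reproduce this kind of selection, the following is a rough Python analogue of sampling without replacement (the study itself used the “sample” function in R). The candidate names, the helper `pick_for_validation`, and the seed are placeholders, not the values used in the study.

```python
import random

def pick_for_validation(candidates, k, seed=None):
    """Draw k distinct items uniformly at random (no replacement)."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(list(candidates), k)

# 284 placeholder identifiers standing in for the glucose transport-related PPIs
candidates = [f"PPI_{i}" for i in range(1, 285)]
chosen = pick_for_validation(candidates, 90, seed=1)
```

      Recording the seed alongside the candidate list is one simple way to let others verify that a subset was drawn at random.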

      *) Figure 4 provides interesting correlations with the goal to reveal properties of mutable and less mutable PPIs. PPIs detected in the PPIseq screen can partially be correlated to co-expression (4A) as well as co-localization. Does it make sense to correlate the co-expression across number of conditions? Are the expression correlation condition specific. In this graph it could be that expression correlation stems from condition 1 and 2 and the interaction takes place in 4 and 5 still leading to the same conclusion ... Is the picture of the co-expression correlation similar when you simply look at individual environments like in S4A?

      We use co-expression mutual rank scores from the COXPRESdb v7.3 database (Obayashi et al., 2019). These mutual rank scores are derived from a broad set of 3593 environmental perturbations that are not limited to the environments we tested here. By using this data, we are asking if co-expression in general is correlated with mutability and report that it is in Figure 4A. We thank the reviewer for pointing out that this was not clear and have now added text to clarify that the co-expression analysis is derived from external data on page 6, line 7.

      We first asked whether co-expression is indeed a predictor of PPI mutability and found that it is: co-expression mutual rank (which is inversely proportional to co-expression across thousands of microarray experiments) declined with PPI mutability (Figures 4A and S11) (Obayashi and Kinoshita, 2009; Obayashi et al., 2019).

      The new figure S11 examines how the co-expression mutual rank changes with PPI mutability for PPIs identified in each environment, as the reviewer suggested. For each environment, we find the same general pattern as in Figure 4A (which considers PPIs from all environments).

      *) Figure 4C: Interesting, how dependent are the various categories?

      It is well known that many of these categories are correlated (e.g. mRNA expression level and protein abundance, and deletion fitness effect and genetic interaction degree). However, we believe it is most valuable to report the correlation of each category with PPI mutability independently in Figures 4C and S12, since similar correlations with related categories provide more confidence in our conclusions.

      *) Figure 4 F: When binned in the number of environments in which the PPI was found, the distribution peaks at 6 environments and decreases with higher and lower number of environments. The description /explanation in the text clearly says something else.

      We reported on page 7, line 15:

      We next used logistic regression to determine what features may underlie a good or poor fit to the model (Figure S14C) and found that PPI mutability was the best predictor, with more mutable PPIs being less frequently explained (Figure 4F). Unexpectedly, mean protein abundance was the second best predictor, with high abundance predicting a poor fit to the model, particularly for less mutable PPIs (Figure S14D and S14E).

      As the reviewer notes, Figure 4F shows that the percent of heterodimers explained by the model does appear to decrease for PPIs observed in the most environments. We suspect that the reviewer is correct that something more complicated is going on. One possibility is that extraordinarily stable PPIs (stable in all conditions) would have less quantitative variation in protein or PPI abundance across environments. If this is true, it would be statistically difficult to fit the mass action kinetics model for these PPIs (lower signal relative to noise), thereby resulting in the observed dip.

      A second possibility is that multiple correlated factors are associated with contributing positively or negatively to a good fit, and the simplicity of Figure 4F or a Pearson correlation does not capture this interplay. This second possibility is why we used multivariate logistic regression (Figure S14C) to dissect the major contributing factors. In the text quote above, we report that high abundance is anti-correlated with a good fit to the model (S14D, S14E). Figure 4C shows that immutable PPIs tend to be formed from highly abundant proteins. One possible explanation is that highly abundant proteins saturate the binding sites of their binding partners, breaking from the assumptions of mass action kinetics model. We have now changed the word “limit” to “saturate” on page 7, line 22 to make this concept more explicit.

      Taken together, these data suggest that mutable PPIs are subject to more post-translational regulation across environments and that high basal protein abundance may saturate the binding sites of their partners, limiting the ability of gene expression changes to regulate PPIs.

      A third possibility is that the dip is simply due to noise. Given the complexity of the possible explanations and our uncertainty about which is more likely, we chose to leave this description out of the main text and focus on the major finding: that PPIs detected in more environments are generally associated with a better fit to the mass action kinetics model.

      *) Figure 6: I apologize, but for my taste this is not a final figure 6 for this study. Investigation of different environments increases the PPI network in yeast, yes, yet it is very well known that a saturation is reached after testing of several conditions, different methods and even screening repetition (sampling). It does not represent an important outcome. Move to suppl or remove.

      We included Figure 6 to summarize and illustrate the path forward from this study. This is an explicit reference to impactful computational analyses done using earlier generations of data to assess the completeness of single-condition interaction networks (Hart et al., 2006; Sambourg and Thierry-Mieg, 2010). Here, we are extending PPI measurement of millions-scale networks across multiple environments, and are using this figure to extend these concepts to multi-condition screens. We agree that the property of saturation in sampling is well known, but it is surprising that we can quantitatively estimate convergence of this expanded condition-specific PPI set using only 9 conditions. Thus, we agree with Reviewer 1 that these are “remarkable results” and that the “upper limit is profoundly important to modelling cellular network complexity and, if it holds up, could define a general upper limit on organismal complexity.” We think this is an important advance of the paper, and this figure is useful to stimulate discussion and guide future work.

      Reviewer #2 (Significance (Required)):

      Liu et al. increase the current PPI network in yeast and offer a substantial dataset of novel PPIs seen in specific environments only. This resource can be used to further investigate the biological meaning of the PPI changes. The data set is compared to previous DHFR providing some sort of quality benchmarking. Mutable interactions are characterized well. Clearly a next step could be to start some "orthogonal" validation, i.e. beyond yeast growth under methotrexate treatment.

      The reviewer makes a great point that we also discuss on page 9, line 33:

      While we used reconstruction of C-terminal-attached mDHFR fragments as a reporter for PPI abundance, similar massively parallel assays could be constructed with different PCA reporters or tagging configurations to validate our observations and overcome false negatives that are specific to our reporter. Indeed, the recent development of “swap tag” libraries, where new markers can be inserted C- or N-terminal to most genes (Weill et al., 2018; Yofe et al., 2016), in combination with our iSeq double barcoder collection (Liu et al., 2019), makes extension of our approach eminently feasible.

      Reviewer 3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The manuscript "A large accessory protein interactome is rewired across environments" by Liu et al. scales up a previously-described method (PPiSeq) to test a matrix of ~1.6 million protein pairs of direct protein-protein interactions in each of 9 different growth environments.

      While the study found a small fraction of immutable PPIs that are relatively stable across environments, the vast majority were 'mutable' across environments. Surprisingly, PPIs detected only in one environment made up more than 60% of the map. In addition to a false positive fraction that can yield apparently-mutable interactions, retest experiments demonstrate (not surprisingly) that environment-specificity can sometimes be attributed to false-negatives. The study authors predict that the whole subnetwork within the space tested will contain 11K true interactions.

      Much of environment-specific rewiring seemed to take place in an 'accessory module', which surrounds the core module made of mostly immutable PPIs. A number of interesting network clustering and functional enrichment analyses are performed to characterize the network overall and 'mutable' interactions in particular. The study reports other global properties such as expression level, protein abundance and genetic interaction degree that differ between mutable and immutable PPIs. One of the interesting findings was evidence that many environmentally mutable PPI changes are regulated post-translationally. Finally, authors provide a case study about network rewiring related to glucose transport.

      **Major issues**

      -The results section should more prominently describe the dimensions of the matrix screen, both in terms of the set of protein pairs attempted and the set actually screened (I think this was 1741 x 1113 after filtering?). More importantly, the study should acknowledge in the introduction that this was NOT a random sample of protein pairs, but rather focused on pairs for which interaction had been previously observed in the baseline condition. This major bias has a potentially substantial impact on many of the downstream analyses. For example, any gene which was not expressed under the conditions of the original Tarassov et al. study on which the screening space was based will not have been tested here. Thus, the study has systematically excluded interactions involving proteins with environment-dependent expression, except where they happened to be expressed in the single Tarassov et al. environment. Heightened connectivity within the 'core module' may result from this bias, and if Tarassov et al. had screened in hydrogen peroxide (H2O2) instead of SD media, perhaps the network would have exhibited a core module in H2O2 decorated by less-densely connected accessory modules observed in other environments. The paper should clearly indicate which downstream analyses have special caveats in light of this design bias.

      We have now added text describing the matrix dimensions of our study on page 3, line 3:

      To generate a large PPiSeq library, all strains from the protein interactome (mDHFR-PCA) collection that were found to contain a protein likely to participate in at least one PPI (1742 X 1130 protein pairs), (Tarassov et al., 2008) were barcoded in duplicate using the double barcoder iSeq collection (Liu et al., 2019), and mated together in a single pool (Figure 1A). Double barcode sequencing revealed that the PPiSeq library contained 1.79 million protein pairs and 6.05 million double barcodes (92.3% and 78.1% of theoretical, respectively, 1741 X 1113 protein pairs), with each protein pair represented by an average of 3.4 unique double barcodes (Figure S1).

      We agree with the reviewer that our selection of proteins from a previously identified set can introduce bias in our conclusions. Our research question was focused on how PPIs change across environments, and thus we chose to maximize our power to detect PPI changes by selecting a set of protein pairs that are enriched for PPIs. We have now added a discussion of the potential caveats of this choice to the discussion on page 9, line 4:

      Results presented here and elsewhere (Huttlin et al., 2020) suggest that PPIs discovered under a single condition or cell type are a small subset of the full protein interactome emergent from a genome. We sampled nine diverse environments and found approximately 3-fold more interactions than in a single environment. However, the discovery of new PPIs began to saturate, indicating that most condition-specific PPIs can be captured in a limited number of conditions. Testing in many more conditions and with PPI assays orthogonal to PPiSeq will undoubtedly identify new PPIs, however a more important outcome could be the identification of coordinated network changes across conditions. Using a test set of ~1.6 million (of ~18 million) protein pairs across nine environments, we find that specific parts of the protein interactome are relatively stable (core modules) while others frequently change across environments (accessory modules). However, two important caveats of our study must be recognized before extrapolating these results to the entire protein interactome across all environment space. First, we tested for interactions between a biased set of proteins that have previously been found to participate in at least one PPI as measured by mDHFR-PCA under standard growth conditions (Tarassov et al., 2008). Thus, proteins that are not expressed under standard growth conditions are excluded from our study, as are PPIs that are not detectable by mDHFR-PCA or PPiSeq. It is possible that a comprehensive screen using multiple orthogonal PPI assays would alter our observations related to the relative dynamics of different regions of the protein interactome and the features of mutable and immutable PPIs. Second, we tested a limited number of environmental perturbations under similar growth conditions (batch liquid growth). It is possible that more extreme environmental shifts (e.g. growth as a colony, anaerobic growth, pseudohyphal growth) would introduce new accessory modules or alter the mutability of the PPIs we detect. Nevertheless, results presented here provide a new mechanistic view of how the cell changes in response to environmental challenges, building on the previous work that describes coordinated responses in the transcriptome (Brauer et al., 2007; Gasch et al., 2000) and proteome (Breker et al., 2013; Chong et al., 2015).

      -Related to the previous issue, a quick look at the proteins tested (if I understood them correctly) showed that they were enriched for genes encoding the elongator holoenzyme complex, DNA-directed RNA polymerase I complex, membrane docking and actin binding proteins, among other functional enrichments. Genes related to DNA damage (endonuclease activity and transposition), were depleted. It was unclear whether the functional enrichment analyses described in the paper reported enrichments relative to what would be expected given the bias inherent to the tested space?

      We did two functional enrichment analyses in this study: network density within Gene Ontology terms (related to Figure 2) and gene ontology enrichment of network communities (related to Figure 3). For both analyses, we performed comparisons to proteins included in the PPiSeq library. This is described in the Supplementary Materials on page 63, line 35:

      To estimate GO term enrichment in our PPI network, we constructed 1000 random networks by replacing each bait or prey protein that was involved in a PPI with a randomly chosen protein from all proteins in our screen. This randomization preserves the degree distribution of the network.

      And on page 66, line 38:

      The set of proteins used for enrichment comparison are proteins that are involved in at least one PPI as determined by PPiSeq.
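
      The randomization quoted above can be illustrated with a short sketch. It assumes that each protein in the network is swapped for a distinct protein drawn from the screened set (a one-to-one relabeling), which is one way to keep every node's degree unchanged; the exact procedure in the supplementary methods may differ. The function names and the toy network are hypothetical.

```python
import random
from collections import Counter

def randomize_network(edges, screened_proteins, seed=None):
    """Relabel every protein in the network with a distinct, randomly
    chosen protein from the screened set (assumed one-to-one mapping)."""
    rng = random.Random(seed)
    in_network = sorted({p for edge in edges for p in edge})
    replacements = rng.sample(sorted(screened_proteins), len(in_network))
    mapping = dict(zip(in_network, replacements))
    return [(mapping[a], mapping[b]) for a, b in edges]

def degree_distribution(edges):
    """Sorted list of node degrees, used to compare two networks."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.values())

# Toy network and toy screened set (placeholder protein names)
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
screened = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J"}
null_networks = [randomize_network(edges, screened, seed=i) for i in range(1000)]
```

      Enrichment of a GO term would then be judged by comparing the observed statistic with its distribution over the 1000 randomized networks.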

      -Re: data quality. To the study's great credit, they incorporated positive and random reference sets (PRS and RRS) into the screen. However, the results from this were concerning: Table SM6 shows that assay stringency was set such that between 1 and 3 out of 67 RRS pairs were detected. This specificity would be fine for an assay intended to retest or validate previous hits, where the prior probability of a true interaction is high, but in large-scale screening the prior probability of true interactions that are detectable by PCA is much lower, and a higher specificity is needed to avoid being overwhelmed by false positives. Consider this back of the envelope calculation: Let's say that the prior probability of true interaction is 1% as the authors suggest (pg 49, section 6.5), and if PCA can optimistically detect 30% of these pairs, then the number of true interactions we might expect to see in an RRS of size 67 is 1% * 30% * 67 = 0.2 . This back of the envelope calculation suggests that a stringency allowing 1 hit in RRS will yield 80% [ (1 - 0.2) / 1 ] false positives, and a stringency allowing 3 hits in RRS will yield 93% [ (3 - 0.2) / 3] false positives. How do the authors reconcile these back of the envelope calculations from their PRS and RRS results with their estimates of precision?
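
      The reviewer's arithmetic can be reproduced with a few lines of Python. The function name is hypothetical; the inputs (1% prior, 30% detection rate, 67 RRS pairs, 1 or 3 hits) are taken directly from the comment above.

```python
def expected_false_fraction(prior, sensitivity, rrs_size, hits):
    """Fraction of RRS hits expected to be false positives, given the
    number of true interactions the assay is expected to detect."""
    expected_true = prior * sensitivity * rrs_size  # 0.01 * 0.30 * 67, about 0.2
    return (hits - expected_true) / hits

one_hit = expected_false_fraction(0.01, 0.30, 67, 1)     # about 0.80
three_hits = expected_false_fraction(0.01, 0.30, 67, 3)  # about 0.93
```

      Note that with only 67 pairs, a single additional or missing hit shifts these fractions considerably, which is the small-sample concern addressed in the response below.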

      We thank the reviewer for bringing up this issue. We included positive and random reference sets (PRS: 70 protein pairs, RRS: 67 protein pairs) to benchmark our PPI calling (Yu et al., 2008). The PRS reference lists PPIs that have been validated by multiple independent studies and is therefore likely to represent true PPIs that are present in some subset of the environments we tested. For the PRS set, we found a rate of detection that is comparable to other studies (PPiSeq in SD: 28%, Y2H and yellow fluorescent protein-PCA: ~20%) (Yu et al., 2008). The RRS reference, developed ten years ago, is randomly chosen protein pairs for which there was no evidence of a PPI in the literature at the time (mostly in standard growth conditions). Given the relatively high rate of false negatives in PPI assays, this set may in fact contain some true PPIs that have yet to be discovered. We could detect PPIs for four RRS protein pairs in our study, when looking across all 9 environments. Three of these (Grs1_Pet10, Rck2_Csh1, and YDR492W_Rpd3) could be detected in multiple environments (9, 7, and 3, respectively), suggesting that their detection was not a statistical or experimental artifact of our bar-seq assay (see table below derived from Table S4). The remaining PPI detected in the RRS was only detected in SD (standard growth conditions) but with a relatively high fitness (0.35), again suggesting its detection was not a statistical or experimental artifact. While we do acknowledge it is possible that these are indeed false positives due to erroneous interactions of chimeric DHFR-tagged versions of these proteins, the small size of the RRS, combined with the fact that some of the protein pairs could be true PPIs, did not give us confidence that this rate (4 of 67) is representative of our true false positive rate.
To determine a false positive rate that is less subject to biases stemming from sampling of small numbers, we instead generated 50 new, larger random reference sets, by sampling for each set ~ 60,000 protein pairs without a reported PPI in BioGRID. Using these new reference sets, we found that the putative false positive rate of our assay is generally lower than 0.3% across conditions for each of the 50 reference sets. We therefore used this more statistically robust measure of the false positive rate to estimate positive predictive values (PPV = 62%, TPR = 41% in SD). We detail these statistical methods in Section 6 of the supplementary methods and report all statistical metrics in Table SM6.

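      | PPI | Environment_number | SD | H2O2 | Hydroxyurea | Doxorubicin | Forskolin | Raffinose | NaCl | 16 ℃ | FK506 |
      |---|---|---|---|---|---|---|---|---|---|---|
      | Rck2_Csh1 | 7 | 0.35 | 0.35 | 0 | 0.20 | 0.54 | 0.74 | 0 | 0.17 | 0.59 |
      | Grs1_Pet10 | 9 | 0.44 | 0.39 | 0.34 | 0.25 | 0.65 | 1.19 | 0.2 | 0.16 | 0.95 |
      | YDR492W_Rpd3 | 3 | 0 | 0.18 | 0 | 0 | 0 | 0 | 0 | 0.17 | 0.61 |
      | Mrps35_Bub3 | 1 | 0.35 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
      | Positive_control | 9 | 1 | 0.8 | 0.73 | 0.62 | 1.4 | 2.44 | 0.4 | 0.28 | 1.8 |

      Table. Mean fitness in each environment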

      -Methods for estimating precision and recall were not sufficiently well described to assess. Precision vs recall plots would be helpful to better understand this tradeoff as score thresholds were evaluated.

      We describe in detail our approach to calling PPIs in section 6.6 of the supplementary methods, including Table SM6, and Figures SM3, SM4, SM6, and now Figure SM5. We identified positive PPIs using a dynamic threshold that considers the mean fitness and p-value in each environment. For each dynamic threshold, we estimated the precision and recall based on the reference sets (described in section 6.5 of the supplementary methods). We then chose the threshold with the maximal Matthews correlation coefficient (MCC) to obtain the best balance between precision and recall. We have now added an additional plot (Figure SM5) that shows the precision and recall for the chosen dynamic threshold in each environment.
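
      As an illustration of choosing a threshold by maximal MCC, the sketch below scans candidate cutoffs on a single score. This is a simplification: the dynamic threshold in the study combines mean fitness and p-value, whereas here one hypothetical score per protein pair is assumed, and the function names and example values are illustrative only.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def best_threshold(scores, labels, thresholds):
    """Return (threshold, MCC) for the cutoff with the highest MCC.

    scores: one score per reference pair; labels: True for positive
    reference pairs, False for random reference pairs.
    """
    best = None
    for t in thresholds:
        calls = [s >= t for s in scores]
        tp = sum(c and l for c, l in zip(calls, labels))
        fp = sum(c and not l for c, l in zip(calls, labels))
        fn = sum((not c) and l for c, l in zip(calls, labels))
        tn = sum((not c) and (not l) for c, l in zip(calls, labels))
        value = mcc(tp, fp, tn, fn)
        if best is None or value > best[1]:
            best = (t, value)
    return best

# Hypothetical reference pairs: three positives, three random pairs
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
labels = [True, True, True, False, False, False]
chosen_threshold, chosen_mcc = best_threshold(scores, labels, [0.15, 0.35, 0.85])
```

      MCC is a convenient single criterion here because it uses all four confusion-matrix cells and stays informative when positives are much rarer than negatives.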

      -Within the tested space, the Tarassov et al map and the current map could each be compared against a common 'bronze standard' (e.g. literature curated interactions), at least for the SD map, to have an idea about how the quality of the current map compares to that of the previous PCA map. Each could also be compared with the most recent large-scale Y2H study (Yu et al).

      We thank the reviewer for this suggestion. We have now added a figure panel (Figure S4) that compares PPiSeq in SD (2 replicates) to mDHFR-PCA (Tarassov et al., 2008), Y2H (Yu et al., 2008), and our newly constructed ‘bronze standard’ high-confidence positive reference set (PRS, supplementary methods section 6.4).

      • Experimental validation of the network was done by conventional PCA. However, it should be noted that this is a form of technical replication of the DHFR-based PCA assay, and not a truly independent validation. Other large-scale yeast interaction studies (e.g., Yu et al, Science 2008) have assessed a random subset of observed PPIs using an orthogonal approach, calibrated using PRS and RRS sets examined via the same orthogonal method, from which overall performance of the dataset could be determined.

      We appreciate the reviewer’s perspective, since orthogonal validation experiments have been a critical tool to establish assay performance following early Y2H work. We know from careful work done previously that modern orthogonal assays have a low cross validation rate (Yu et al., 2008) and that they tend to be enriched for PPIs in different cellular compartments (Jensen and Bork, 2008), indicating that high false negative rates are the likely explanation. High false negative rates have been confirmed here and elsewhere using positive reference sets (e.g. Y2H 80%, PCA 80%, PPiSeq 74% using the PRS in (Yu et al., 2008)). Therefore, the expectation is that PPiSeq, as with other assays, will have a low rate of validation using an orthogonal assay -- although we would not know if this rate is 10%, 30% or somewhere in between without performing the work. However, the exact number -- whether it be 10% or 30% -- has no practical impact on the main conclusions of this study (focused on network dynamics rather than network enumeration). Neither does that number speak to the confidence in our PPI calls, since a lower number may simply be due to less overlap in the sets of PPIs that are callable by PPiSeq and another assay. Our method uses bar-seq to extend an established mDHFR-PCA assay (Tarassov et al., 2008). The validations we performed were aimed at confirming that our sequencing, barcode counting, fitness estimation, and PPI calling protocols were not introducing excessive noise relative to mDHFR-PCA that resulted in a high number of PPI miscalls. Confirming this, we do indeed find a high rate of validation by lower throughput PCA (50-90%, Figure 3B). Finally, we do include independent tests of the quality of our data by comparing it to positive and random reference sets from literature curated data. We find that our assay performs extremely well (PPV > 61%, TPR > 41%) relative to other high-throughput assays.

      -The Venn diagram in Figure 1G was not very informative in terms of assessing the quality of data. It looks like there is relatively little overlap between PPIs identified in standard conditions (SD media) in the current study and those of the previous study using a very similar method. Is there any way to know how much of this disagreement can be attributed to each screen being sub-saturation (e.g. by comparing replica screens) and what fraction to systematic assay or environment differences?

      We have now added a figure panel (Figure S4) that compares PPiSeq in SD (2 replicates) to mDHFR-PCA (Tarassov et al., 2008), Y2H (Yu et al., 2008), and our newly constructed ‘bronze standard’ high-confidence positive reference sets (PRS, supplementary methods section 6.4). We find that SD replicates have an overlap coefficient of 79% with each other, ~45% with mDHFR-PCA, ~45% with the ‘bronze standard’ PRS, and ~13% with Y2H. Overlap coefficients between the SD replicates and mDHFR-PCA are much higher than those found between orthogonal methods (Yu et al., 2008), indicating that these two assays are identifying a similar set of PPIs. We do note that PPiSeq and mDHFR-PCA do screen for PPIs under different growth conditions (batch liquid growth vs. colonies on agar), so some fraction of the disagreement is due to environmental differences. PPIs that overlap between the two PPiSeq SD replicates are more likely to be found in mDHFR-PCA, PRS, and Y2H, indicating that PPIs identified in a single SD replicate are more likely to be false positives. However, we do find (a lower rate of) overlaps between PPIs identified in only one SD replicate and other methods, suggesting that a single PPiSeq replicate is not finding all discoverable PPIs.
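
      For reference, an overlap coefficient between two PPI sets can be computed as in the sketch below. It assumes the standard (Szymkiewicz-Simpson) definition, the size of the intersection divided by the size of the smaller set; the definition used for Figure S4 is not restated in this response, and the protein pairs shown are placeholders.

```python
def overlap_coefficient(set_a, set_b):
    """Size of the intersection divided by the size of the smaller set."""
    a, b = set(set_a), set(set_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def ppi(protein_1, protein_2):
    """Unordered protein pair, so that (A, B) and (B, A) compare equal."""
    return frozenset((protein_1, protein_2))

# Placeholder PPI calls from two hypothetical replicates
replicate_1 = {ppi("A", "B"), ppi("B", "C"), ppi("C", "D")}
replicate_2 = {ppi("B", "A"), ppi("C", "D"), ppi("E", "F"), ppi("G", "H")}
shared = overlap_coefficient(replicate_1, replicate_2)  # 2 shared pairs / 3
```

      Because the denominator is the smaller set, this measure is not penalized when one dataset is much larger than the other, which is useful when comparing screens of different depth.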

      -In Figure S5C, the environment-specificity rate of PPIs might be inflated due to the fact that authors only test for the absence of SD hits in other conditions, and the SD condition is the only condition that has been sampled twice during the screening. What would be the environment-specific verification rate if sample hits from each environment were tested in all environments? This seems important, as robustly detecting environment-specific PPIs is one of the key points of the study.

      We use PPIs found in the SD environment to determine the environment-specificity because this provides the most conservative (highest) estimate of the number of PPIs found in other environments that were not detectable by our bar-seq assay. To identify PPIs in the SD environment, we pooled fitness estimates across the two replicates (~ 4 fitness estimates per replicate, ~ 8 total). The higher number of replicates results in a reduced rate of false positives (an erroneous fitness estimate has less impact on a PPI call), meaning that we are more confident that PPIs identified in SD are true positives. Because false positives in one environment (but not other environments) are likely to erroneously contribute to the environment-specificity rate, choosing the environment with the lowest rate of false positives (SD) should result in the lowest environment-specificity rate (highest estimate of PPIs found in other environments that were not detectable by our bar-seq assay).

      **Minor issues**

      -Re: "An interaction between the proteins reconstitutes mDHFR, providing resistance to the drug methotrexate and a growth advantage that is proportional to the PPI abundance" (pg 2). It may be more accurate to say "monotonically related" than "proportional" here. Fig 2 from the cited Freschi et al ref does suggest linearity with colony size over a wide range of inferred complex abundances, but non-linear at low complex abundance. Also note that Freschi measured colony area which is not linear with exponential growth rate nor with cell count.

      We agree with the reviewer and have changed “proportional” to “monotonically related” on page 2, line 41.

      -Re: "Using putatively positive and negative reference sets, we empirically determined a statistical threshold for each environment with the best balance of precision and recall (positive predictive value (PPV) > 61% in SD media, Methods, section 6)." (pg 3). Should state the recall at this PPV.

      We agree with the reviewer and have added the recall (41%) in the main text (page 3, line 26).

      Using putatively positive and negative reference sets, we empirically determined a statistical threshold for each environment with the best balance of precision and recall (positive predictive value (PPV) > 61% and true positive rate > 41% in SD media, Methods, section 6).

      -Authors could discuss the extent to which related methods (e.g. PMID: 28650476, PMID: 27107012, PMID: 29165646, PMID: 30217970) would be potentially suitable for screening in different environments.

      We have now added a reference to a barcode-based Y2H study that examined interactions between yeast proteins to the introduction on page 2, line 2:

      Yet, little is known about how PPI networks reorganize on a global scale or what drives these changes. One challenge is that commonly-used high-throughput PPI screening technologies are geared toward PPI identification (Gavin et al., 2002; Ito et al., 2001; Tarassov et al., 2008; Uetz et al., 2000; Yu et al., 2008, Yachie et al., 2016), not a quantitative analysis of relative PPI abundance that is necessary to determine if changes in the PPI network are occurring. The murine dihydrofolate reductase (mDHFR)‐based protein-fragment complementation assay (PCA) provides a viable path to characterize PPI abundance changes because it is a sensitive test for PPIs in the native cellular context and at native protein expression levels (Freschi et al., 2013; Remy and Michnick, 1999; Tarassov et al., 2008).

      We have excluded the references to other barcode-based Y2H studies that the reviewer mentions because they test heterologous proteins within yeast, and the effect of perturbations to yeast on these proteins would be difficult to interpret in the context of our questions. The yeast protein Y2H study, although a wonderful approach and paper, would also not be an appropriate method to examine how PPI networks change across environments because protein fusions are not expressed under their endogenous promoters and must be transported to, in many cases, a non-native compartment (cell nucleus) to be detected. Rather than explicitly discuss the caveats of this particular approach, we have instead chosen to discuss why we use PCA.

      • the term "mutable" is certainly appropriate according to the dictionary definition of changeable. The authors may wish to consider though, that in a molecular biology context the term evokes changeability by mutation (a very interesting but distinct topic). Maybe another term (environment-dependent interactions or ePPIs?) would be clearer. Of course this is the authors' call.

      We thank the reviewer for this suggestion, and have admittedly struggled with the terminology. For clarity of presentation, we strived to have a single word that describes the property of a PPI that is at the core of this manuscript -- how frequently a PPI is found across environments. However, the most descriptive words come with preloaded meanings in PPI research (e.g. transient, stable, dynamic), as does “mutable” with another research field. We are, quite frankly, open to suggestions from the reviewers or editors for a more appropriate word that does not raise similar objections.

      -Some discussion is warranted about the phenomenon that a PPI that is unchanged in abundance could appear to change because of statistical significance thresholds that differ between screens. This would be a difficult question for any such study, and I don't think the authors need to solve it, but just to discuss.

      We agree with the reviewer that significance thresholds could be impacting our interpretations and discuss this idea at length on page 4, line 23 of the Results. This section has been modified to include an additional analysis (excluding 16 ℃ data) in response to another reviewer’s comment:

      Immutable PPIs were likely to have been previously reported by colony-based mDHFR-PCA or other methods, while the PPIs found in the fewest environments were not. One possible explanation for this observation is that previous PPI assays, which largely tested in standard laboratory growth conditions, and variations thereof, are biased toward identification of the least mutable PPIs. That is, since immutable PPIs are found in nearly all environments, they are more readily observed in just one. However, another possible explanation is that, in our assay, mutable PPIs are more likely to be false positives in environment(s) in which they are identified or false negatives in environments in which they are not identified. To investigate this second possibility, we first asked whether PPIs present in very few environments have lower fitnesses, as this might indicate that they are closer to our limit of detection. We found no such pattern: mean fitnesses were roughly consistent across PPIs found in 1 to 6 conditions, although they were elevated in PPIs found in 7-9 conditions (Figure S6A). To directly test the false-positive rate stemming from pooled growth and barcode sequencing, we validated randomly selected PPIs within each mutability bin by comparing their optical density growth trajectories against controls (Figure 3B). We found that mutable PPIs did indeed have lower validation rates in the environment in which they were identified, yet putative false positives were limited to ~50%, and, within a bin, do not differ between PPIs that have been previously identified and those that have been newly discovered by our assay (Figure S65B). We also note mutable PPIs might be more sensitive to environmental differences between our large pooled PPiSeq assays and clonal 96-well validation assays, indicating that differences in validation rates might be overstated.

      To test the false-negative rate, we assayed PPIs identified in only SD by PPiSeq across all other environments by optical density growth and found that PPIs can be assigned to additional environments (Figure S6C). However, the number of additional environments in which a PPI was detected was generally low (2.5 on average), and the interaction signal in other environments was generally weaker than in SD (Figure S6D). To better estimate how the number of PPIs changes with PPI mutability, we used these optical density assays to model the validation rate as a function of the mean PPiSeq fitness and the number of environments in which a PPI is detected. This accurate model (Spearman's r = 0.98 between predicted and observed, see Methods) provided confidence scores (predicted validation rates) for each PPI (Table S5) and allowed us to adjust the true positive PPI estimate in each mutability bin. Using this more conservative estimate, we still found a preponderance of mutable PPIs (Figure S6E). Finally, we used a pair of more conservative PPI calling procedures that either identified PPIs with a low rate of false positives across all environments (FPR

      We later examine major conclusions of our study using more conservative calling procedures, and find that they are consistent. On page 6, line 14:

      Both the co-expression and co-localization patterns were also apparent in our higher confidence PPI sets (Figures S7B, S7C, S8B, and S8C), indicating that they are not caused by different false positive rates between the mutability bins.

      And on page 6, line 19:

      We binned proteins by their PPI degree, and, within each bin, determined the correlation between the mutability score and another gene feature (Figure 4C and S12A, Table S8) (Costanzo et al., 2016; Finn et al., 2014; Gavin et al., 2006; Holstege et al., 1998; Krogan et al., 2006; Levy and Siegal, 2008; Myers et al., 2006; Newman et al., 2006; Östlund et al., 2010; Rice et al., 2000; Stark et al., 2011; Wapinski et al., 2007; Ward et al., 2004; Yang, 2007; Yu et al., 2008). These correlations were also calculated using our higher confidence PPI sets, confirming results from the full data set (Figures S7D, S7E, S8D, and S8E). We found that mutable hubs (> 15 PPIs) have more genetic interactions, in agreement with predictions from co-expression data (Bertin et al., 2007; Han et al., 2004), and that their deletion tends to cause larger fitness defects.

      -More discussion would be helpful about the idea that immutability may to some extent favor interactions that PCA is better able to detect (possibly including membrane proteins?)

      We agree with the reviewer and have now added a discussion of this potential caveat to the Discussion on page 9, line 4:

      Results presented here and elsewhere (Huttlin et al., 2020) suggest that PPIs discovered under a single condition or cell type are a small subset of the full protein interactome emergent from a genome. We sampled nine diverse environments and found approximately 3-fold more interactions than in a single environment. However, the discovery of new PPIs began to saturate, indicating that most condition-specific PPIs can be captured in a limited number of conditions. Testing in many more conditions and with PPI assays orthogonal to PPiSeq will undoubtedly identify new PPIs, however a more important outcome could be the identification of coordinated network changes across conditions. Using a test set of ~1.6 million (of ~18 million) protein pairs across nine environments, we find that specific parts of the protein interactome are relatively stable (core modules) while others frequently change across environments (accessory modules). However, two important caveats of our study must be recognized before extrapolating these results to the entire protein interactome across all environment space. First, we tested for interactions between a biased set of proteins that have previously been found to participate in at least one PPI as measured by mDHFR-PCA under standard growth conditions (Tarassov et al., 2008). Thus, proteins that are not expressed under standard growth conditions are excluded from our study, as are PPIs that are not detectable by mDHFR-PCA or PPiSeq. It is possible that a comprehensive screen using multiple orthogonal PPI assays would alter our observations related to the relative dynamics of different regions of the protein interactome and the features of mutable and immutable PPIs. Second, we tested a limited number of environmental perturbations under similar growth conditions (batch liquid growth). It is possible that more extreme environmental shifts (e.g. growth as a colony, anaerobic growth, pseudohyphal growth) would introduce new accessory modules or alter the mutability of the PPIs we detect. Nevertheless, results presented here provide a new mechanistic view of how the cell changes in response to environmental challenges, building on the previous work that describes coordinated responses in the transcriptome (Brauer et al., 2007; Gasch et al., 2000) and proteome (Breker et al., 2013; Chong et al., 2015).

      -Re: "As might be expected, we also found that mutable hubs, but not non-hubs, are more likely to participate in multiple protein complexes than less mutable proteins." (pg 6) This is a cool result. To what extent was this result driven by members of one or two complexes? If so, it would be worth noting them.

      We thank the reviewer for this question. We have now included Figure S13, which shows the number and size of protein complexes that underlie the finding that mutable hubs are more likely to participate in multiple protein complexes. We find that proteins in our screen that participate in multiple complexes are distributed over a wide range of complexes, indicating that this observation is not driven by one or two complexes. On page 6, line 34:

      As might be expected, we also found that mutable hubs, but not non-hubs, are more likely to participate in multiple protein complexes than less mutable proteins (Figures S13A-C) (Costanzo et al., 2016).

      -Re: "Borrowing a species richness estimator from ecology (Jari Oksanen et al., 2019), we estimate that there are ~10,840 true interactions within our search space across all environments, ~3-fold more than are detected in SD (note difference to Figure 3, which counts observed PPIs)." (pg 8) Should note that this only allows estimation of the number of interactions that are detectable by PCA methods. Previous work (Braun et al, 2019) showed that every known protein interaction assay (including PCA approaches) can only detect a fraction of bona fide interactions.

      We agree with the reviewer and have modified the discussion to make this point explicit on page 9, line 4:

      Results presented here and elsewhere (Huttlin et al., 2020) suggest that PPIs discovered under a single condition or cell type are a small subset of the full protein interactome emergent from a genome. We sampled nine diverse environments and found approximately 3-fold more interactions than in a single environment. However, the discovery of new PPIs began to saturate, indicating that most condition-specific PPIs can be captured in a limited number of conditions. Testing in many more conditions and with PPI assays orthogonal to PPiSeq will undoubtedly identify new PPIs, however a more important outcome could be the identification of coordinated network changes across conditions.

      We continue in this paragraph to discuss the implications:

      Using a test set of ~1.6 million (of ~18 million) protein pairs across nine environments, we find that specific parts of the protein interactome are relatively stable (core modules) while others frequently change across environments (accessory modules). However, two important caveats of our study must be recognized before extrapolating these results to the entire protein interactome across all environment space. First, we tested for interactions between a biased set of proteins that have previously been found to participate in at least one PPI as measured by mDHFR-PCA under standard growth conditions (Tarassov et al., 2008). Thus, proteins that are not expressed under standard growth conditions are excluded from our study, as are PPIs that are not detectable by mDHFR-PCA or PPiSeq. It is possible that a comprehensive screen using multiple orthogonal PPI assays would alter our observations related to the relative dynamics of different regions of the protein interactome and the features of mutable and immutable PPIs.
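
      The richness extrapolation discussed in this comment can be illustrated with a minimal Python sketch. It is not the authors' analysis: the quoted text does not say which estimator was applied, so the incidence-based Chao2 estimator is assumed here purely for illustration, and the function name `chao2_richness` and the toy detection table are hypothetical. As the reviewer notes, any estimate of this kind covers only interactions that the assay is able to detect.

```python
from collections import Counter


def chao2_richness(detections, n_environments):
    """Chao2 incidence-based richness estimate (illustrative assumption).

    detections: dict mapping a PPI identifier to the set of environments
                in which that PPI was detected (hypothetical input format).
    n_environments: number of environments sampled.
    """
    # How many PPIs were seen in exactly k environments, ignoring undetected ones.
    counts = Counter(len(envs) for envs in detections.values() if envs)
    s_obs = sum(counts.values())   # PPIs observed at least once
    q1, q2 = counts[1], counts[2]  # PPIs seen in exactly one / exactly two environments
    correction = (n_environments - 1) / n_environments
    if q2 > 0:
        return s_obs + correction * q1 ** 2 / (2 * q2)
    # Bias-corrected form used when no PPI is seen in exactly two environments.
    return s_obs + correction * q1 * (q1 - 1) / 2


# Toy data, not taken from the study: five PPIs across three environments.
toy = {
    "A": {"env1"},
    "B": {"env1"},
    "C": {"env1", "env2"},
    "D": {"env2", "env3"},
    "E": {"env1", "env2", "env3"},
}
estimate = chao2_richness(toy, n_environments=3)
print(f"Observed: {len(toy)}, estimated detectable total: {estimate:.2f}")
```

      In this toy table, two PPIs are seen once and two are seen twice, so the estimate sits only slightly above the observed count. A larger share of single-environment PPIs would push the estimate further above what was observed.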

      -Re: "This analysis shows that the number of PPIs present across all environments is much larger than the number observed in a single condition, but that it is feasible to discover most of these new PPIs by sampling a limited number of conditions." (pg 8). The main point is surely correct, but it is worth noting that extrapolation to the number of true interactions depends on the nine chosen environments being representative of all environments. The situation could change under more extreme, e.g., anaerobic, conditions.

      We agree with the reviewer and make this point explicit, continuing from the paragraph quoted above on page 9, line 22:

      Second, we tested a limited number of environmental perturbations under similar growth conditions (batch liquid growth). It is possible that more extreme environmental shifts (e.g. growth as a colony, anaerobic growth, pseudohyphal growth) would introduce new accessory modules or alter the mutability of the PPIs we detect. Nevertheless, results presented here provide a new mechanistic view of how the cell changes in response to environmental challenges, building on the previous work that describes coordinated responses in the transcriptome (Brauer et al., 2007; Gasch et al., 2000) and proteome (Breker et al., 2013; Chong et al., 2015).

      -It stands to reason that proteins expressed in all conditions will yield less mutable interactions, if 'mutability' is primarily due to expression change at the transcriptional level. They should at least discuss that measuring mRNA levels could resolve questions about this. Could use Waern et al G3 2013 data (H2O2, SD, HU, NaCl) to predict the dynamic interactome purely by node removal, and see how conclusions would change.

      We agree with the reviewer that mRNA abundance could potentially be used as a proxy for protein abundance and have added this point on page 10, line 28:

      Here we use homodimer abundance as a proxy for protein abundance. However, genome-wide mRNA abundance measures could be used as a proxy for protein abundance or protein abundance could be measured directly in the same pool (Levy et al., 2014) by, for example, attaching a full length mDHFR to each gene using “swap tag” libraries mentioned above (Weill et al., 2018; Yofe et al., 2016).

      However, using mRNA abundance as a proxy for protein abundance in this study has several important caveats that would make interpretation difficult. First, mRNA and protein abundance correlate, but not perfectly (R² = 0.45) (Lahtvee et al., 2017), and our findings suggest that post-translational regulation may be important to driving PPI changes. Second, mRNA abundance measures are for a single time point, while our PPI measures coarse grain over a growth cycle (lag, exponential growth, diauxic shift, saturation). Although we may be able to take multiple mRNA measures across the cycle, time delays between changes in mRNA and protein levels, combined with the fact that we do not know when a PPI is occurring or most prominent over the cycle, would pose a significant challenge to making any claims that PPI changes are driven by changes in protein abundance. We instead chose to focus on a subset of proteins (homodimers) where abundance measures can be coarse grained in the same way as PPI measures. In the above quote, we point to a potential method by which this can be done for all proteins. We also point to how a continuous culturing design could be used to better determine how protein (or mRNA proxy) abundance impacts PPI abundance on page 10, line 6:

      Finally, our assays were performed across cycles of batch growth meaning that changes in PPI abundance across a growth cycle (e.g. lag, exponential growth, saturation) are coarse grained into one measurement. While this method potentially increases our chance of discovering a diverse set of PPIs, it might have an unpredictable impact on the relationship between fitness and PPI abundance (Li et al., 2018). To overcome these issues, strains containing natural or synthetic PPIs with known abundances and intracellular localizations could be spiked into cell pools to calibrate the relationship between fitness and PPI abundance in each environment. In addition, continuous culturing systems may be useful for refining precision of growth-based assays such as ours.

      -The analysis showing that many interactions are likely due to post-translational modifications is very interesting, but caveats should be discussed. Where heterodimers do not fit the expression-level dependence model, some cases of non-fitting may simply be due to measurement error or non-linearity in the relationship between abundance and fitness.

      We show the measurement error in Figures 1, S2, S3. While we agree with the reviewer that measurement error is a general caveat for all results reported, we do not feel that it is necessary to point to that fact in this particular case, which uses a logistic regression to report that PPI mutability was the best predictor of fit to the expression-level dependence model. We discuss the non-linearity caveat on page 9, line 41:

      Our assay detected subtle fitness differences across environments (Fig S5B and S5C), which we used as a rough estimate for changes in relative PPI abundance. While it would be tempting to use fitness as a direct readout of absolute PPI abundance within a cell, non-linearities between fitness and PPI abundance may be common and PPI dependent. For example, the relative contribution of a reconstructed mDHFR molecule to fitness might diminish at high PPI abundances (saturation effects) and fitness differences between PPIs may be caused, in part, by differences in how accessible a reconstructed mDHFR molecule is to substrate. In addition, environmental shifts might impact cell growth rate, initiate a stress response, or result in other unpredictable cell effects that impact the selective pressure of methotrexate and thereby fitness (Figure S2 and S3).

      -Line numbers would have been helpful to note more specific minor comments

      We are sorry for this inconvenience. We have added line numbers in our revised manuscript.

      -Sequence data should be shared via the Short-Read Archive.

      The raw sequencing data have been uploaded to the Short-Read Archive. We mentioned it in the Data and Software Availability section on page 68, line 41.

      Raw barcode sequencing data are available from the NIH Sequence Read Archive as accession PRJNA630095 (https://trace.ncbi.nlm.nih.gov/Traces/study/?acc=SRP259652).

      Reviewer #3 (Significance (Required)):

      Knowledge of protein-protein interactions (PPIs) provides a key window on biological mechanism, and unbiased screens have informed global principles underlying cellular organization. Several genome-scale screens for direct (binary) interactions between yeast proteins have been carried out, and while each has provided a wealth of new hypotheses, each has been sub-saturation. Therefore, even given multiple genome-scale screens our knowledge of yeast interactions remains incomplete. Different assays are better suited to find different interactions, and it is now clear that every assay evaluated thus far is only capable (even in a saturated screen) of detecting a minority of true interactions. More relevant to the current study, no binary interaction screen has been carried out at the scale of millions of protein pairs outside of a single 'baseline' condition.

      The study by Liu et al is notable from a technology perspective in that it is one of several recombinant-barcode approaches that have been developed to multiplex pairwise combinations of two barcoded libraries. Although other methods have been demonstrated at the scale of 1M protein pairs, this is the first study using such a technology at the scale of >1M pairs across multiple environments.

      A limitation is that this study is not genome-scale, and the search space is biased towards proteins for which interactions were previously observed in a particular environment. This is perhaps understandable, as it made the study more tractable, but this does add caveats to many of the conclusions drawn. These would be acceptable if clearly described and discussed. There were also questions about data quality and assessment that would need to be addressed.

      Assuming issues can be addressed, this is a timely study on an important topic, and will be of broad interest given the importance of protein interactions and the status of S. cerevisiae as a key testbed for systems biology.

      *Reviewers' expertise:* Interaction assays, next-generation sequencing, computational genomics. Less able to assess evolutionary biology aspects.

      References

      Brauer, M.J., Huttenhower, C., Airoldi, E.M., Rosenstein, R., Matese, J.C., Gresham, D., Boer, V.M., Troyanskaya, O.G., and Botstein, D. (2007). Coordination of Growth Rate, Cell Cycle, Stress Response, and Metabolic Activity in Yeast. Mol. Biol. Cell 19, 352–367.

      Breker, M., Gymrek, M., and Schuldiner, M. (2013). A novel single-cell screening platform reveals proteome plasticity during yeast stress responses. J. Cell Biol. 200, 839–850.

      Chong, Y.T., Koh, J.L.Y., Friesen, H., Kaluarachchi Duffy, S., Cox, M.J., Moses, A., Moffat, J., Boone, C., and Andrews, B.J. (2015). Yeast Proteome Dynamics from Single Cell Imaging and Automated Analysis. Cell 161, 1413–1424.

      Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G., Botstein, D., and Brown, P.O. (2000). Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes. Mol. Biol. Cell 11, 4241–4257.

      Hart, G.T., Ramani, A.K., and Marcotte, E.M. (2006). How complete are current yeast and human protein-interaction networks? Genome Biol. 7, 120.

      Hilliker, A., Gao, Z., Jankowsky, E., and Parker, R. (2011). The DEAD-box protein Ded1 modulates translation by the formation and resolution of an eIF4F-mRNA complex. Mol. Cell 43, 962–972.

      Isasa, M., Suñer, C., Díaz, M., Puig-Sàrries, P., Zuin, A., Bichmann, A., Gygi, S.P., Rebollo, E., and Crosas, B. (2015). Cold Temperature Induces the Reprogramming of Proteolytic Pathways in Yeast. J. Biol. Chem. jbc.M115.698662.

      Jensen, L.J., and Bork, P. (2008). Not Comparable, But Complementary. Science 322, 56–57.

      Lahtvee, P.-J., Sánchez, B.J., Smialowska, A., Kasvandik, S., Elsemman, I.E., Gatto, F., and Nielsen, J. (2017). Absolute Quantification of Protein and mRNA Abundances Demonstrate Variability in Gene-Specific Translation Efficiency in Yeast. Cell Syst. 4, 495-504.e5.

      Obayashi, T., Kagaya, Y., Aoki, Y., Tadaka, S., and Kinoshita, K. (2019). COXPRESdb v7: a gene coexpression database for 11 animal species supported by 23 coexpression platforms for technical evaluation and evolutionary inference. Nucleic Acids Res. 47, D55–D62.

      Sambourg, L., and Thierry-Mieg, N. (2010). New insights into protein-protein interaction data lead to increased estimates of the S. cerevisiae interactome size. BMC Bioinformatics 11, 605.

      Tarassov, K., Messier, V., Landry, C.R., Radinovic, S., Molina, M.M.S., Shames, I., Malitskaya, Y., Vogel, J., Bussey, H., and Michnick, S.W. (2008). An in Vivo Map of the Yeast Protein Interactome. Science 320, 1465–1470.

      Yu, H., Braun, P., Yıldırım, M.A., Lemmens, I., Venkatesan, K., Sahalie, J., Hirozane-Kishikawa, T., Gebreab, F., Li, N., Simonis, N., et al. (2008). High-Quality Binary Protein Interaction Map of the Yeast Interactome Network. Science 322, 104–110.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The manuscript "A large accessory protein interactome is rewired across environments" by Liu et al. scales up a previously-described method (PPiSeq) to test a matrix of ~1.6 million protein pairs of direct protein-protein interactions in each of 9 different growth environments.

      While the study found a small fraction of immutable PPIs that are relatively stable across environments, the vast majority were 'mutable' across environments. Surprisingly, PPIs detected only in one environment made up more than 60% of the map. In addition to a false positive fraction that can yield apparently-mutable interactions, retest experiments demonstrate (not surprisingly) that environment-specificity can sometimes be attributed to false-negatives. The study authors predict that the whole subnetwork within the space tested will contain 11K true interactions.

      Much of the environment-specific rewiring seemed to take place in an 'accessory module', which surrounds the core module made of mostly immutable PPIs. A number of interesting network clustering and functional enrichment analyses are performed to characterize the network overall and 'mutable' interactions in particular. The study reports other global properties such as expression level, protein abundance and genetic interaction degree that differ between mutable and immutable PPIs. One of the interesting findings was evidence that many environmentally mutable PPI changes are regulated post-translationally. Finally, the authors provide a case study about network rewiring related to glucose transport.

      Major issues

      -The results section should more prominently describe the dimensions of the matrix screen, both in terms of the set of protein pairs attempted and the set actually screened (I think this was 1741 x 1113 after filtering?). More importantly, the study should acknowledge in the introduction that this was NOT a random sample of protein pairs, but rather focused on pairs for which interaction had been previously observed in the baseline condition. This major bias has a potentially substantial impact on many of the downstream analyses. For example, any gene which was not expressed under the conditions of the original Tarassov et al. study on which the screening space was based will not have been tested here. Thus, the study has systematically excluded interactions involving proteins with environment-dependent expression, except where they happened to be expressed in the single Tarassov et al. environment. Heightened connectivity within the 'core module' may result from this bias, and if Tarassov et al had screened in hydrogen peroxide (H2O2) instead of SD media, perhaps the network would have exhibited a core module in H2O2 decorated by less-densely connected accessory modules observed in other environments. The paper should clearly indicate which downstream analyses have special caveats in light of this design bias.

      -Related to the previous issue, a quick look at the proteins tested (if I understood them correctly) showed that they were enriched for genes encoding the elongator holoenzyme complex, DNA-directed RNA polymerase I complex, membrane docking and actin binding proteins, among other functional enrichments. Genes related to DNA damage (endonuclease activity and transposition) were depleted. It was unclear whether the functional enrichment analyses described in the paper reported enrichments relative to what would be expected given the bias inherent to the tested space.

      -Re: data quality. To the study's great credit, they incorporated positive and random reference sets (PRS and RRS) into the screen. However, the results from this were concerning: Table SM6 shows that assay stringency was set such that between 1 and 3 out of 67 RRS pairs were detected. This specificity would be fine for an assay intended to retest or validate previous hits, where the prior probability of a true interaction is high, but in large-scale screening the prior probability of true interactions that are detectable by PCA is much lower, and a higher specificity is needed to avoid being overwhelmed by false positives. Consider this back of the envelope calculation: Let's say that the prior probability of true interaction is 1% as the authors suggest (pg 49, section 6.5), and if PCA can optimistically detect 30% of these pairs, then the number of true interactions we might expect to see in an RRS of size 67 is 1% × 30% × 67 = 0.2. This back of the envelope calculation suggests that a stringency allowing 1 hit in RRS will yield 80% [ (1 - 0.2) / 1 ] false positives, and a stringency allowing 3 hits in RRS will yield 93% [ (3 - 0.2) / 3 ] false positives. How do the authors reconcile these back of the envelope calculations from their PRS and RRS results with their estimates of precision?
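
      The arithmetic in the comment above can be reproduced with a short Python sketch. The numbers (1% prior, 30% detectability, 67 RRS pairs, 1 or 3 RRS hits) are the ones stated in the comment; the function name `rrs_false_positive_fraction` is hypothetical and introduced only for illustration.

```python
def rrs_false_positive_fraction(prior, detectable, rrs_size, rrs_hits):
    """Fraction of RRS hits expected to be false positives.

    prior:      assumed probability that a random pair truly interacts
    detectable: assumed fraction of true interactions the assay can detect
    rrs_size:   number of pairs in the random reference set
    rrs_hits:   number of RRS pairs scored as positive
    """
    if rrs_hits <= 0:
        raise ValueError("rrs_hits must be positive")
    # Expected number of detectable true interactions hiding in the RRS.
    expected_true = prior * detectable * rrs_size
    return (rrs_hits - expected_true) / rrs_hits


expected_true = 0.01 * 0.30 * 67
print(f"Expected true interactions in RRS: {expected_true:.1f}")
for hits in (1, 3):
    frac = rrs_false_positive_fraction(0.01, 0.30, 67, hits)
    print(f"{hits} RRS hit(s): {frac:.0%} of RRS hits expected to be false positives")
```

      Changing `prior` or `detectable` shows how sensitive the conclusion is to those two assumed values.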

      -Methods for estimating precision and recall were not sufficiently well described to assess. Precision vs recall plots would be helpful to better understand this tradeoff as score thresholds were evaluated.
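
      One way to tabulate precision against recall as the score threshold is varied is sketched below in Python. This is an illustration under stated assumptions, not the procedure used in the manuscript: recall is taken as the fraction of PRS pairs scoring at or above the threshold, the false positive rate as the corresponding fraction of RRS pairs, and precision is rescaled with Bayes' rule to an assumed prior probability of interaction in the screened space. The function name `precision_recall_table` and all scores are hypothetical.

```python
def precision_recall_table(prs_scores, rrs_scores, prior):
    """Return (threshold, precision, recall) tuples, strictest threshold first.

    prs_scores: assay scores for positive reference pairs (hypothetical data)
    rrs_scores: assay scores for random reference pairs (hypothetical data)
    prior:      assumed probability that a screened pair truly interacts
    """
    thresholds = sorted(set(prs_scores) | set(rrs_scores), reverse=True)
    table = []
    for t in thresholds:
        recall = sum(s >= t for s in prs_scores) / len(prs_scores)
        fpr = sum(s >= t for s in rrs_scores) / len(rrs_scores)
        # Expected fraction of all screened pairs called positive at this threshold.
        called = prior * recall + (1 - prior) * fpr
        precision = prior * recall / called if called > 0 else float("nan")
        table.append((t, precision, recall))
    return table


# Toy scores, not taken from the study.
prs = [0.9, 0.8, 0.6, 0.4]
rrs = [0.7, 0.3, 0.2, 0.1, 0.05]
table = precision_recall_table(prs, rrs, prior=0.01)
for threshold, precision, recall in table:
    print(f"threshold={threshold:.2f}  precision={precision:.3f}  recall={recall:.2f}")
```

      With a low prior, a single RRS pair passing the threshold is enough to pull the rescaled precision down sharply, which is the concern raised in the preceding comment.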

      -Within the tested space, the Tarassov et al map and the current map could each be compared against a common 'bronze standard' (e.g. literature curated interactions), at least for the SD map, to have an idea about how the quality of the current map compares to that of the previous PCA map. Each could also be compared with the most recent large-scale Y2H study (Yu et al).

      • Experimental validation of the network was done by conventional PCA. However, it should be noted that this is a form of technical replication of the DHFR-based PCA assay, and not a truly independent validation. Other large-scale yeast interaction studies (e.g., Yu et al, Science 2008) have assessed a random subset of observed PPIs using an orthogonal approach, calibrated using PRS and RRS sets examined via the same orthogonal method, from which overall performance of the dataset could be determined.

      -The Venn diagram in Figure 1G was not very informative in terms of assessing the quality of data. It looks like there is a relatively little overlap between PPIs identified in standard conditions (SD media) in the current study and those of the previous study using a very similar method. Is there any way to know how much of this disagreement can be attributed to each screen being sub-saturation (e.g. by comparing replica screens) and what fraction to systematic assay or environment differences?

      -In Figure S5C, the environment-specificity rate of PPIs might be inflated due to the fact that authors only test for the absence of SD hits in other conditions, and the SD condition is the only condition that has been sampled twice during the screening. What would be the environment-specific verification rate if sample hits from each environment were tested in all environments? This seems important, as robustly detecting environment-specific PPIs is one of the key points of the study.

      Minor issues

      -Re: "An interaction between the proteins reconstitutes mDHFR, providing resistance to the drug methotrexate and a growth advantage that is proportional to the PPI abundance" (pg 2). It may be more accurate to say "monotonically related" than "proportional" here. Fig 2 from the cited Freschi et al ref does suggest linearity with colony size over a wide range of inferred complex abundances, but non-linearity at low complex abundance. Also note that Freschi measured colony area, which is not linear with exponential growth rate nor with cell count.

      -Re: "Using putatively positive and negative reference sets, we empirically determined a statistical threshold for each environment with the best balance of precision and recall (positive predictive value (PPV) > 61% in SD media, Methods, section 6)." (pg 3). Should state the recall at this PPV.

      -Authors could discuss the extent to which related methods (e.g. PMID: 28650476, PMID: 27107012, PMID: 29165646, PMID: 30217970) would be potentially suitable for screening in different environments.

      • the term "mutable" is certainly appropriate according to the dictionary definition of changeable. The authors may wish to consider though, that in a molecular biology context the term evokes changeability by mutation (a very interesting but distinct topic). Maybe another term (environment-dependent interactions or ePPIs?) would be clearer. Of course this is the authors' call.

      -Some discussion is warranted about the phenomenon that a PPI that is unchanged in abundance could appear to change because of statistical significance thresholds that differ between screens. This would be a difficult question for any such study, and I don't think the authors need to solve it, but just to discuss.

      -More discussion would be helpful about the idea that immutability may to some extent favor interactions that PCA is better able to detect (possibly including membrane proteins?)

      -Re: "As might be expected, we also found that mutable hubs, but not non-hubs, are more likely to participate in multiple protein complexes than less mutable proteins." (pg 6) This is a cool result. To what extent was this result driven by members of one or two complexes? If so, it would be worth noting them.

      -Re: "Borrowing a species richness estimator from ecology (Jari Oksanen et al., 2019), we estimate that there are ~10,840 true interactions within our search space across all environments, ~3-fold more than are detected in SD (note difference to Figure 3, which counts observed PPIs)." (pg 8) Should note that this only allows estimation of the number of interactions that are detectable by PCA methods. Previous work (Braun et al, 2019) showed that every known protein interaction assay (including PCA approaches) can only detect a fraction of bona fide interactions.

      -Re: "This analysis shows that the number of PPIs present across all environments is much larger than the number observed in a single condition, but that it is feasible to discover most of these new PPIs by sampling a limited number of conditions." (pg 8). The main point is surely correct, but it is worth noting that extrapolation to the number of true interactions depends on the nine chosen environments being representative of all environments. The situation could change under more extreme, e.g., anaerobic, conditions.

      -It stands to reason that proteins expressed in all conditions will yield less mutable interactions, if 'mutability' is primarily due to expression change at the transcriptional level. They should at least discuss that measuring mRNA levels could resolve questions about this. Could use Waern et al G3 2013 data (H2O2, SD, HU, NaCl) to predict the dynamic interactome purely by node removal, and see how conclusions would change.

      -The analysis showing that many interactions are likely due to post-translational modifications is very interesting, but caveats should be discussed. Where heterodimers do not fit the expression-level dependence model, some cases of non-fitting may simply be due to measurement error or non-linearity in the relationship between abundance and fitness.

      -Line numbers would have been helpful to note more specific minor comments

      -Sequence data should be shared via the Short-Read Archive.
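
      The comment above on the species richness estimate can be made concrete with a small sketch. The reviewed text does not say which estimator was used, so the bias-corrected Chao2 estimator below is only an assumed stand-in for an incidence-based richness estimator, and the PPI identifiers and environment names are hypothetical toy data. As the reviewers note, any such estimate covers only interactions that the assay can detect in principle.

```python
from collections import Counter
from typing import Dict, Set


def chao2_richness(detections: Dict[str, Set[str]], n_environments: int) -> float:
    """Bias-corrected Chao2 estimate of total richness from incidence data.

    detections maps each observed PPI (hypothetical identifiers) to the set
    of environments in which it was detected.
    """
    # Number of environments in which each observed PPI was seen.
    counts = Counter(len(envs) for envs in detections.values() if envs)
    s_obs = sum(counts.values())  # PPIs observed at least once
    q1 = counts.get(1, 0)         # PPIs seen in exactly one environment
    q2 = counts.get(2, 0)         # PPIs seen in exactly two environments
    m = n_environments
    # Bias-corrected form, which stays defined even when q2 == 0.
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


# Toy data: five observed PPIs across three environments.
toy = {
    "A-B": {"SD"},
    "A-C": {"SD", "NaCl"},
    "B-C": {"Raffinose"},
    "C-D": {"SD", "NaCl", "Raffinose"},
    "D-E": {"NaCl"},
}
estimate = chao2_richness(toy, n_environments=3)
print(round(estimate, 2))
```

      The estimate rises with the number of PPIs seen in exactly one environment, which is why a screen with many single-environment hits extrapolates to a total well above the observed count.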

      Significance

      Knowledge of protein-protein interactions (PPIs) provides a key window on biological mechanism, and unbiased screens have informed global principles underlying cellular organization. Several genome-scale screens for direct (binary) interactions between yeast proteins have been carried out, and while each has provided a wealth of new hypotheses, each has been sub-saturation. Therefore, even given multiple genome-scale screens our knowledge of yeast interactions remains incomplete. Different assays are better suited to find different interactions, and it is now clear that every assay evaluated thus far is only capable (even in a saturated screen) of detecting a minority of true interactions. More relevant to the current study, no binary interaction screen has been carried out at the scale of millions of protein pairs outside of a single 'baseline' condition.

      The study by Liu et al is notable from a technology perspective in that it is one of several recombinant-barcode approaches that have been developed to multiplex pairwise combinations of two barcoded libraries. Although other methods have been demonstrated at the scale of 1M protein pairs, this is the first study using such a technology at the scale of >1M pairs across multiple environments.

      A limitation is that this study is not genome-scale, and the search space is biased towards proteins for which interactions were previously observed in a particular environment. This is perhaps understandable, as it made the study more tractable, but this does add caveats to many of the conclusions drawn. These would be acceptable if clearly described and discussed. There were also questions about data quality and assessment that would need to be addressed.

      Assuming issues can be addressed, this is a timely study on an important topic, and will be of broad interest given the importance of protein interactions and the status of S. cerevisiae as a key testbed for systems biology.

      Reviewers' expertise: Interaction assays, next-generation sequencing, computational genomics. Less able to assess evolutionary biology aspects.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Report on Liu et al. "A large accessory protein interactome is rewired across environments". Liu et al. use a mDHFR-based, pooled barcode sequencing / competitive growth / mild methotrexate selection method to investigate changes of PPI abundance of 1.6 million protein pairs across 9 different growth conditions. Because most PPI screens aim to identify novel PPIs in standard growth conditions, the currently known yeast PPI network may be incomplete. The key concept is to define "immutable" PPIs that are found in all conditions and "mutable" PPIs that are present in only some conditions. The assay identified 13764 PPIs across the 9 conditions, using optimized fitness cut offs. Steady PPIs, i.e. those found across all environments, were identified in membrane compartments and cell division. Processes associated with the chromosome, transcription, protein translation, RNA processing and ribosome regulation were found to change between conditions. Mutable PPIs form modules, as topological analyses reveal.

      Interestingly, a correlation between intrinsic disorder and PPI mutability was found; mutable PPIs are postulated to be more flexible in the conformational context, while at the same time being formed by less abundant proteins.

      I appreciate the trick to use homodimerization as an abundance proxy to predict interaction between heterodimers (of proteins that homodimerize). This "mass-action kinetics model" explains the strength of 230 out of 1212 tested heterodimers.
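
      The reasoning behind using homodimer signal as an abundance proxy can be sketched as follows. This is only an illustration under simple equilibrium assumptions (free monomers A and B, fixed dissociation constants across environments); the symbols are introduced here and the exact model used in the manuscript is not reproduced in this report.

```latex
% Assumed equilibria with dissociation constants K_{AA}, K_{BB}, K_{AB}:
\[
[AA] = \frac{[A]^2}{K_{AA}}, \qquad
[BB] = \frac{[B]^2}{K_{BB}}, \qquad
[AB] = \frac{[A][B]}{K_{AB}} .
\]
% Eliminating the free monomer concentrations [A] and [B]:
\[
[AB] = \frac{\sqrt{K_{AA} K_{BB}}}{K_{AB}} \, \sqrt{[AA]\,[BB]}
\quad\Longrightarrow\quad
\log [AB] = \text{const} + \tfrac{1}{2}\bigl(\log [AA] + \log [BB]\bigr).
\]
```

      Under these assumptions, a heterodimer whose abundance tracks the geometric mean of the two homodimer abundances across environments is consistent with regulation by expression level alone, while departures point to other mechanisms or to measurement error.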

      A validation experiment of the glucose transporter network was performed and 90 "randomly chosen" PPIs that were present in the SD environment were tested in NaCl (osmotic stress) and Raffinose (low glucose) conditions through recording optical density growth trajectories. Hxt5 PPIs stayed similar in the tested conditions, supported by the current knowledge that Hxt5 is highly expressed in stationary phase and under salt stress. In Raffinose, Hxt7, previously reported to increase the mRNA expression, lost most PPIs indicating that other factors might influence Hxt7 PPIs.

      Points for consideration:

      *) A clear definition of mutable and immutable is missing, or could not be found e.g. at page 4 second paragraph.

      *) Approximately half of the PPIs have been identified in one environment. Many of those mutable PPIs were detected in the 16°C condition. Is there an explanation for the predominance of this specific environment? What are these PPIs about?

      *) 50 % overall retest validation rate is fair and reflects a value comparable to other large-scale approaches. However, what is the actual variation, e.g. between mutable and immutable PPIs, or between conditions, e.g. at 16°C?

      *) What is the R correlation cutoff for PPIs explained in the mass equilibrium model vs. not explained?

      *) 90 "randomly chosen" PPIs for validation. It needs to be demonstrated that these interactions are a random subset; otherwise it could also mean cherry-picked interactions ...

      *) Figure 4 provides interesting correlations with the goal to reveal properties of mutable and less mutable PPIs. PPIs detected in the PPIseq screen can partially be correlated to co-expression (4A) as well as co-localization. Does it make sense to correlate the co-expression across number of conditions? Is the expression correlation condition-specific? In this graph it could be that expression correlation stems from condition 1 and 2 and the interaction takes place in 4 and 5, still leading to the same conclusion ... Is the picture of the co-expression correlation similar when you simply look at individual environments like in S4A?

      *) Figure 4C: Interesting, how dependent are the various categories?

      *) Figure 4F: When binned in the number of environments in which the PPI was found, the distribution peaks at 6 environments and decreases with higher and lower number of environments. The description/explanation in the text clearly says something else.

      *) Figure 6: I apologize, but for my taste this is not a final figure 6 for this study. Investigation of different environments increases the PPI network in yeast, yes, yet it is very well known that a saturation is reached after testing of several conditions, different methods and even screening repetition (sampling). It does not represent an important outcome. Move to suppl or remove.

      Significance

      Liu et al. increase the current PPI network in yeast and offer a substantial dataset of novel PPIs seen in specific environments only. This resource can be used to further investigate the biological meaning of the PPI changes. The data set is compared to previous DHFR providing some sort of quality benchmarking. Mutable interactions are characterized well. Clearly a next step could be to start some "orthogonal" validation, i.e. beyond yeast growth under methotrexate treatment.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript is clearly written and the figures appropriate and informative. Some descriptions of data analyses are a little dense but reflect what would appear to be long, hard efforts on the part of the authors to identify and control for possible sources of misinterpretation due to sensitivities of parameters in their fitness model. The authors' efforts to retest interactions under non-competition conditions allay most of the concerns that I would have. One problem though that I could not see explicitly addressed was that of potential effects of interactions between methotrexate and the other conditions and how this is controlled for. Specifically, it could be argued that the fact that a particular PPI is observed under a specific condition could have more to do with a synthetic effect of treatment of cells with a drug plus methotrexate. Is this controlled for and how? I raise this because in a chemical genetic screen for fitness it was shown that methotrexate is particularly promiscuous for drug-drug interactions (Hillenmeyer ME, et al. Science 2008). I tried to think of how this works but couldn't come up with anything immediately. I'd appreciate if the authors would take a crack at resolving this issue. Otherwise I have no further concerns about the manuscript.

      Significance

      Liu et al expand on previous work from the Levy group to explore a massive in vivo protein interactome in the yeast S. cerevisiae. They achieve this by performing screens across 9 growth conditions, which, with replication, results in a total of 44 million measurements. Interpreting their results based on a fitness model for pooled growth under methotrexate selection, they make the key observation that there is a vastly expanded pool of protein-protein interactions (PPIs) that are found under only one or two conditions compared to a more limited set of PPIs that are found under a broad set of conditions (mutable versus immutable interactors). The authors show that this dichotomy suggests some important features of proteins and their PPIs that raise important questions about functionality and evolution of PPIs. Among these are that mutable PPIs are enriched for cross-compartmental, high disorder and higher rates of evolution and subcellular localization of proteins to chromatin, suggesting roles in gene regulation that are associated with cellular responses to new conditions. At the same time these interactions are not enriched for changes in abundance. These results are in contrast to those of immutable PPIs, which seem to form a core background noise, more determined by changes in abundance than what the authors interpret must be post-translational processes that may drive, for instance, changes in subcellular localization resulting in appearance of PPIs under specific conditions. The authors are also able to address a couple of key issues about protein interactomes, including the controversial Party-date Hub hypothesis of Vidal, in which they could now affirm support for this hypothesis based on their results and notably negative correlation of PPIs to protein abundance for mutable PPIs. Finally, they also addressed the problem of predicting the upper limit of PPIs in yeast, showing the remarkable results that it may be no more than about 2 times the number of proteins expressed by yeast. Such an upper limit is profoundly important to modelling cellular network complexity and, if it holds up, could define a general upper limit on organismal complexity.

      This manuscript is a very important contribution to understanding dynamics of molecular networks in living cells and should be published with high priority.
      
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their close reading and constructive comments on our manuscript. We believe that their insight has substantially strengthened our manuscript. Please find our response/revision plan for each comment below (in blue). Note, because of the substantial changes to the figures and the additional experiments that we are undertaking, we have not initially revised the text. The proposed textual revisions will be included in the full revision.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The Katz lab has contributed greatly to the field of epigenetic reprogramming over the years, and this is another excellent paper on the subject. I enjoyed reviewing this manuscript and don't have any major comments/suggestions for improving it. The findings presented are novel and important, the results are clear cut, and the writing is clear.

      It's important to stress the novelty of the findings, which build upon previous studies from the same lab (upon a shallow look one might think that some of the conclusions were described before, but this is not the case). Despite the fact that this system has been studied in depth before, it remained unclear why and how germline genes are bookmarked by H3K36 in the embryo, and it wasn't known why germline genes are not expressed in the soma.

      To study these questions Carpenter et al. examine multiple phenotypes (developmental aberrations, sterility), which they combine with analysis of multiple genetic backgrounds, RNA-seq, ChIP-seq, single molecule FISH, and fluorescent transgenes.

      Previous observations from the Katz lab suggested that progeny derived from spr-5;met-2 double mutants can develop abnormally. They show here that the progeny of these double mutants (unlike spr-5 and met-2 single mutants) develop severe and highly penetrant developmental delays, a Pvl phenotype, and sterility. They show also that spr-5; met-2 maternal reprogramming prevents developmental delay by restricting ectopic MES-4 bookmarking, and that developmental delay of spr-5;met-2 progeny is the result of ectopic expression of MES-4 germline genes. The bottom line is that they shed light on how SPR-5, MET-2 and MES-4 balance inter-generational inheritance of H3K4, H3K9, and H3K36 methylation, to allow correct specification of germline and somatic cells. This is all very important and relevant also to other organisms.

      **(very) Minor comments:**

      -Since the word "heritable" is used in different contexts, it could be helpful to elaborate, perhaps in the introduction, on the distinction between cellular memory and transgenerational inheritance.

      We are happy to elaborate on this in the revised manuscript.

      -It might be interesting in the Discussion to expand further about the links between heritable chromatin marks and heritable small RNAs. They do hint that the results regarding the silencing of the somatic transgene are especially intriguing.

      We are happy to expand this in the revised manuscript.

      Reviewer #1 (Significance (Required)):

      This is an exciting paper which builds upon years of important work in the Katz lab. The novelty of the paper is in pinpointing the mechanisms that bookmark germline genes by H3K36 in the embryo, and explaining why and how germline genes are prevented from being expressed in the soma.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Katz and colleagues examine the interaction between the methyltransferase MES-4 and spr-5; met-2 double mutants. Their prior analysis (PNAS, 2014) showed the dramatic enhancement in sterility and development for spr-5; met-2; this paper extends that finding by showing these effects depend on MES-4. The results are interesting and the genetic interactions dramatic. The examination by RNAseq and ChIP helps move the phenotypes into a more molecular analysis. The authors hypothesize that SPR-5 and MET-2 modify chromatin of germline genes (MES-4 targets) in somatic cells, and this is required to silence germline genes in the soma. A few issues need to be resolved to test these ideas and rule out others.

      **Main comments:**

      The authors' hypothesis is that SPR-5 and MET-2 act directly, to modify chromatin of germline genes (MES-4 targets), but an alternate hypothesis is that the key regulated genes are i) MES-4 itself and/or ii) known regulators of germline gene expression e.g. the piwi pathway. Misregulation of these factors in the soma could be responsible for the phenotypes. Therefore, the authors should analyze expression (smFISH and where possible protein stains) for MES-4 and PIWI components in the embryo and larvae of wildtype, double and triple mutant strains. These experiments are essential and not difficult to perform.

      In our RNA-seq analysis we see a small elevation of MES-4 itself (average 1.18 log2 fold change across 5 replicates). This does not seem likely to be solely driving such a dramatic phenotype. Nevertheless, it is possible that the small increase in expression of MES-4 itself could be contributing. To determine if MES-4 is being ectopically expressed in spr-5; met-2 double mutants, we have obtained a tagged version of MES-4 from Dr. Susan Strome and will use this to examine the localization of MES-4 protein in spr-5; met-2 double mutants. We are definitely interested in the potential interaction between PIWI components and the histone modifying enzymes that we have explored in this study. However, since RNAi of MES-4 is sufficient to rescue the developmental delay of spr-5; met-2 mutants, we have chosen to focus on that interaction in this paper. In the future, we hope to examine the role of PIWI components in this system.

      A second aspect of the hypothesis is that spr-5 and met-2 act before mes-4 and that while these genes are maternally expressed, they act in the embryo. There really aren't data to support these ideas - the timing and location of the factors' activities have not been pinned down. One way to begin to address this question would be to perform smFISH on the target genes and on mes-4 in embryos and determine when and where changes first appear. smFISH in embryos is critical - relying on L1 data is too late. If timing data cannot be obtained, then I suggest that the authors back off of the timing ideas or at least explain the caveats. Certainly, figure 8 should be simplified and timing removed. (note: Typical maternal effect tests probably won't work because if the genes' RNAs are germline deposited, then a maternal effect test will reflect when the RNA is expressed but not when the protein is active. A TS allele would be needed, and that may not be available.)

      To determine the timing of the ectopic expression of MES-4 targets, we have performed smFISH on two MES-4 targets in embryos. Thus far, these experiments show that MES-4 targets are ectopically expressed in the embryo, but only after the maternal to zygotic transition. This is consistent with our proposed model. A figure containing this data will be added to the revised manuscript. In addition, our model is predicated on the known embryonic protein localization of SPR-5 and MES-4. Maternal SPR-5 protein is present in the early embryo up to around the 8-cell stage, but absent in later embryos (Katz et al., 2009). In addition, in mice, the SPR-5 ortholog LSD1 is required maternally prior to the 2-cell stage (Wasson et al., 2016 and Ancelin et al., 2016). In contrast, MES-4 continues to be expressed in the embryo until later embryonic stages where it is concentrated into the germline precursors Z2 and Z3 (Fong et al., 2002). This is consistent with SPR-5 establishing a chromatin state that continues to be antagonized by MES-4. There is evidence that MET-2 is expressed both in early embryos and later embryos. However, since the phenotype of MET-2 so closely resembles the phenotype of SPR-5 (Kerr et al., 2014), we have included it in our model as working with SPR-5. Further experimentation will be required to substantiate the model, but we believe the model is consistent with all of the current data.

      Writing/clarity:

      -It would be helpful to include a table that lists the specific genes studied in the paper and how they behaved in the different assays e.g. RNAseq 1, RNAseq 2, MES-4 target, ChIP. That way, readers will understand each of the genes better.

      We are happy to include a table in the revised manuscript.

      -At the end of each experiment, it would be helpful to explain the conclusion and not wait until the Discussion. For readers not in the field, the logic of the Results section is hard to follow.

      This seems like a stylistic choice. Traditionally, papers did not include any conclusions in the results section, and it is our preference to keep our paper organized this way. However, if the reviewer would still like us to change this, we are happy to do so.

      -The model is explained over three pages in the Discussion. It would be great to begin with a single paragraph that summarizes the model/point of the paper simply and clearly.

      The discussion in the revised manuscript will be altered to include this.

      **Specific comments:**

      -Figure 1 has been published previously and should be moved to the supplement.

      In our original paper (Kerr et al.) we reported in the text that spr-5; met-2 mutants have a developmental delay. However, we did not characterize this developmental delay. Nor did we include any images of the double mutants, except for one image of the adult germline phenotype. As a result, we believe that the inclusion of the developmental delay in the main body of this manuscript is warranted.

      -Cite their prior paper for the vulval defects e.g. page 6 or show in supplement.

      We are happy to include a citation of our previous paper for the vulval defects in the revised manuscript.

      -The second RNAseq data should be shown in the Results since it is much stronger. The first RNAseq, which is less robust, should be moved to supplement.

      The revised manuscript will include this alteration.

      -Figure 3 is very nice. Please explain why the RNAs were picked (+ the table, see comment above), and please add here or in a new figure mes-4 and piwi pathway expression data in wildtype vs double/triple mutants.

      We performed RT-PCR on 9 MES-4 targets. These 9 targets were picked because they had the highest ectopic expression in spr-5; met-2 mutants and largest change in H3K36me3 in spr-5; met-2 mutants versus Wild Type. Amongst these 9 genes, we performed smFISH on htp-1 and cpb-1 because they are relatively well characterized as germline genes.

      The revised manuscript will include added panels to supplemental figure 2 showing the expression of PIWI pathway components.

      -Figure 3 here or later, please show if mes-4 RNAi removes somatic expression of target genes.

      We are currently carrying out this experiment. Once it is completed, the data will hopefully be added to the paper.

      -Is embryogenesis delayed?

      Embryogenesis seems to be sped up in spr-5; met-2 mutants. A supplemental figure will be added to the revised manuscript showing this. It is unclear why embryogenesis is sped up. However, this confirms that the developmental delay is unique to the L1/L2 stages.

      -Figure 4 since htp-1 smFISH is so dramatic, it would be helpful to include htp-1 in the lower panels.

      htp-1 will be added to the lower panels in the revised manuscript.

      -Figure 4, please add an extra 2 upper panels showing all the genes in N2 vs spr-5;met-2, for comparison to the mes-4 cohort.

      As a control, we will add panels showing a comparison to all germline genes, excluding MES-4 targets. This new data shows that germline genes that are not MES-4 targets do not have ectopic H3K36me3. This data, which further suggests that the phenomenon is confined to MES-4 targets, is consistent with our results showing that MES-4 RNAi is sufficient to suppress the developmental delay.

      -Figure 6. Please show a control that met-1 RNAi is working.

      We performed RT-PCR to try and confirm that met-1 RNAi was working. Despite controls repeating the MES-4 suppression and verifying that RNAi was working, we were unable to demonstrate that met-1 was knocked down. As a result, we will remove this result from the paper. Importantly, this does not affect the conclusion of the paper.

      -To quantify histone marks more clearly, it would be wonderful to have a graph of the mean log across the gene. Showing the mean numbers would help clarify the degree of the effect. We had an image as an example but it does not paste into the reviewer box. Instead, see figure 2 or figure 4 here: https://www.nature.com/articles/ng.322

      We will attempt to include this analysis in the revised manuscript.

      Reviewer #2 (Significance (Required)):

      Katz and colleagues examine the interaction between the methyltransferase MES-4 and spr-5; met-2 double mutants. Their prior analysis (PNAS, 2014) showed the dramatic enhancement in sterility and development for spr-5; met-2; this paper extends that finding by showing these effects depend on MES-4. The results are interesting and the genetic interactions dramatic. The examination by RNAseq and ChIP helps move the phenotypes into a more molecular analysis.

      This work will be of interest to people following transgenerational inheritance, generally in the C. elegans field. People using other organisms may read it also, although some of the worm genetics may be complicated. Some of the writing suggestions could make a difference.

      I study C. elegans embryogenesis, chromatin and inheritance.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In the paper entitled "C. elegans establishes germline versus soma by balancing inherited histone methylation" Carpenter BS et al examined a double mutant worm strain they had previously produced of the H3K4me1/2 demethylase spr-5 and the predicted H3K9me1/me2 methylase met-2. These mutant worms have a developmental delay that arises by the L2 larval stage. They performed an analysis of what genes get misexpressed in these double mutants by performing RNAseq and compared this to datasets generated from other labs on an H3K36me2/me3 methylase MES-4 where they see a high degree of overlap. They validate the misexpression of some germline specific genes in the soma by in situ and validate that there is a dysregulation of H3K36me3 in their double mutant worms. They further find that knocking down mes-4 reverts the developmental delay.

      I think that the authors need to make more of an effort to be a bit more scholarly in terms of placing their work in the context of the field as a whole and also need to add a few additional experiments as well as reorganize a bit before this is ready for publication. Remember that the average reader is not necessarily an expert in C. elegans or this particular field and you really want to try and make the manuscript as accessible to everyone as possible.

      **Major Points**

      1)It would be good to see western blots or quantitative mass spec examining H3K36me3 in the WT and spr-5;met-2 double mutant worms. I believe this was also previously reported by Greer EL et al Cell Rep 2014 in the single spr-5 mutant worm so that work should be cited here in addition to the identification of JMJD-2 as an enzyme involved in the inheritance of H3K4me2 phenotype.

      The ectopic H3K36me3 is confined to a small set of MES-4 targets. We don’t even see ectopic H3K36me3 at non-MES-4 germline genes (see above). Therefore, we don’t expect to see any global differences in bulk H3K36me3. Greer et al reported that there are elevated H3K36me3 levels in spr-5 mutants. This discrepancy may be due to different stages (embryos, germline) present in their bulk preparation. Alternatively, the met-2 mutant may counteract the effect of the spr-5 mutation on H3K36me3. Regardless, we believe that the genome-wide ChIP-seq is more informative than bulk H3K36me3 levels.

      We will add a citation for the Greer paper in the revised manuscript.

      2)Missing from Fig.5 is mes-4 KD by itself. This is needed to determine whether these effects are specific to the spr-5;met-2 double mutants or more general effects that KD of mes-4 would decrease the expression of all these genes to a similar extent. Then statistics should be done to see if the decrease in the WT context is the same or greater than the decrease in the double mutants.

      The MES-4 targets are generally expressed only in the germline and defined by having mes-4 dependent H3K36me3. Knocking down mes-4 would be expected to prevent the expression of these genes in the germline, but this is difficult to test because mes-4 mutants basically don’t make a germline. Regardless, knocking down mes-4 by itself would only assess the role of MES-4 in germline transcription, not the ectopic expression that is being assayed in spr-5; met-2 mutants in Fig 5. Importantly, it remains possible that spr-5; met-2 mutants might also result in an increase in the expression of MES-4 targets in the germline. However, the experiments performed in this manuscript were conducted on L1 larvae, which do not have any germline expression, to eliminate this potential confounding contribution.

      **Minor Points**

      1)A greater attempt needs to be made to be more scholarly for citing previously published literature. This includes work on the inheritance of H3K27 and H3K36 methylation in C. elegans and other species as well. A few papers which seem germane to this story which should be cited in the intro are (Nottke AC et al PNAS 2011, Gaydos LJ et al Science 2014, Ost A et al Cell 2014, Greer EL et al Cell Rep 2014, Siklenka K et al Science 2015, Tabuchi TM et al Nat Comm 2018, Kaneshiro KR et al Nat Comm 2019). This problem is not restricted to the intro.

      Although many of these excellent papers are broadly relevant to this current work, they are not necessarily directly relevant to this paper. For this reason, they were not originally cited. Nevertheless, we will attempt to cite these papers in the revised version when possible.

      2)I think that the authors need to be a little less definitive with their language. Theories should be introduced as possibilities rather than conclusions. Should remove "comprehensive" from intro as there are many other methods which could be done to test this.

      Throughout the manuscript, we have tried to be clear about what the data suggest versus what is a model based on the data. Nevertheless, to further clarify this, we are happy to remove “comprehensive” from the intro.

      3)The authors should describe what PIE-1 is. Is this a transcription factor?

      PIE-1 is a transcriptional inhibitor that is thought to block RNA polII elongation by mimicking the CTD of RNA polII and competing for phosphorylation. We are happy to add a reference to this function in the revised manuscript.

      4)The language needs clarification about MES-4 germline genes and bookmark genes. Are these bound by MES-4 or marked with K36me2/3?

      The revised manuscript will be modified to make this definition more clear.

      5)I think Fig S1 E+F should be in the main figure 1 so readers can see the extent of the phenotype.

      The original single image of the spr-5; met-2 adult germline phenotype (including the protruding vulva) was included in our previous publication. In this manuscript, we have now quantified this phenotype, which is why it is included in the supplement here. However, because the original picture was included in our original publication, we prefer to leave it as supplemental.

      6)For Fig S2 it would be good to do the same statistics that are done in Fig 2 and mention them in the text so the readers can see that the overlap is statistically significant.

      We are happy to include these statistics in the revised manuscript.
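
      One common way to test whether the overlap between two gene lists is larger than expected by chance is a one-sided hypergeometric test. The sketch below is illustrative only: the source does not state which test was used for Fig 2, and the function name `overlap_p_value` and all gene counts are hypothetical placeholders, not values from the manuscript.

```python
from math import comb


def overlap_p_value(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """One-sided hypergeometric p-value for a gene-list overlap.

    Returns P(X >= overlap), where X is the number of genes shared between
    a list of size set_a and a list of size set_b, both drawn from a common
    background of `universe` genes. All arguments are counts.
    """
    max_overlap = min(set_a, set_b)
    total = comb(universe, set_b)  # number of ways to choose list B
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not taken from the manuscript).
p = overlap_p_value(universe=20000, set_a=200, set_b=1000, overlap=50)
print(f"p-value for the overlap: {p:.3g}")
```

      The choice of background (`universe`) matters: using all annotated genes rather than only the genes detectable in the assay can make an overlap look more significant than it is.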

      7)Fig S2.2 should be yellow blue rather than red green for the colorblind out there.

      Thanks for pointing this out. We are happy to change the colors in the revised manuscript.

      8)When saying "Many of these genes involved in these processes..." the authors need to include numbers and statistics.

      We will amend the revised text to make the definition of the MES-4 genes more clear.

      9)Should use WT instead of N2 and specify what wildtype is in methods.

      We will use WT instead of N2 in the revised manuscript.

      10)Fig. 2A + B could be displayed in a single figure. And Fig 2D seems superfluous and could be combined with 2C or alternatively it could be put in supplementary.

      Figure 2A and 2B were purposely separated to make it clear how many of the overlapped changes are up versus down. In the revised manuscript, Figure 2D will be moved to the supplement.

      11)Non-C. elegans experts won't understand what balancers are. An effort should be made to make this accessible to all. Explaining when genes are heterozygous or homozygous mutants seems relevant here.

      The text of the revised manuscript will be amended to make it more accessible for non-C. elegans readers.

      12)The GO categories (Fig. S2) should be in the main figure and need to be made to look more scientific rather than copied and pasted from a program.

      The GO categories were included to be comprehensive and do not contribute substantially to the main conclusion of the paper. This is why they are supplemental. In the revised manuscript, we will edit the GO results so that they look more scientific.

      13)Fig. 7 seems a bit out of place. If the authors were to KD mes-4 and similarly show that the phenotype reverts, that would help justify its inclusion in this paper. Without that, it seems like a bit of an add-on that belongs elsewhere.

      We believe that the somatic expression of a transgene in spr-5; met-2 mutants adds to our potential understanding of how this double mutant may lead to developmental delay. This is true regardless of whether the somatic transgene expression is mes-4 dependent or not.

      Reviewer #3 (Significance (Required)):

      I think this is an interesting and timely piece of work. A little more effort needs to be put in to make sure it is accessible to the average reader and has sufficient inclusion of more of the large body of work on inheritance of histone modifications. I think C. elegans researchers as well as people interested in inheritance and the setup of the germline will be interested in this work.

      REFEREES CROSS COMMENTING

      I agree with Reviewer #2's comments on experiments to include or exclude alternative models. I also agree about their statement about rewriting to make it more accessible to others who aren't experts in this specialized portion of C. elegans research. All in all it seems like the experiments which are required by reviewer #2 and myself as well as the rewriting should be quite feasible.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In the paper entitled "C. elegans establishes germline versus soma by balancing inherited histone methylation" Carpenter BS et al examined a double mutant worm strain they had previously produced of the H3K4me1/2 demethylase spr-5 and the predicted H3K9me1/me2 methylase met-2. These mutant worms have a developmental delay that arises by the L2 larval stage. They performed an analysis of what genes get misexpressed in these double mutants by performing RNAseq and compare this to datasets generated from other labs on an H3K36me2/me3 methylase MES-4 where they see a high degree of overlap. They validate the misexpression of some germline specific genes in the soma by in situ and validate that there is a dysregulation of H3K36me3 in their double mutant worms. They further find that knocking down mes-4 reverts the developmental delay.

      I think that the authors need to make more of an effort to be a bit more scholarly in terms of placing their work in the context of the field as a whole and also need to add a few additional experiments as well as reorganize a bit before this is ready for publication. Remember that the average reader is not necessarily an expert in C. elegans or this particular field and you really want to try and make the manuscript as accessible to everyone as possible.

      Major Points

      1)It would be good to see western blots or quantitative mass spec examining H3K36me3 in the WT and spr-5;met-2 double mutant worms. I believe this was also previously reported by Greer EL et al Cell Rep 2014 in the single spr-5 mutant worm so that work should be cited here in addition to the identification of JMJD-2 as an enzyme involved in the inheritance of H3K4me2 phenotype.

      2)Missing from Fig.5 is mes-4 KD by itself. This is needed to determine whether these effects are specific to the spr-5;met-2 double mutants or more general effects that KD of mes-4 would decrease the expression of all these genes to a similar extent. Then statistics should be done to see if the decrease in the WT context is the same or greater than the decrease in the double mutants.

      Minor Points

      1)A greater attempt needs to be made to be more scholarly for citing previously published literature. This includes work on the inheritance of H3K27 and H3K36 methylation in C. elegans and other species as well. A few papers which seem germane to this story which should be cited in the intro are (Nottke AC et al PNAS 2011, Gaydos LJ et al Science 2014, Ost A et al Cell 2014, Greer EL et al Cell Rep 2014, Siklenka K et al Science 2015, Tabuchi TM et al Nat Comm 2018, Kaneshiro KR et al Nat Comm 2019). This problem is not restricted to the intro.

      2)I think that the authors need to be a little less definitive with their language. Theories should be introduced as possibilities rather than conclusions. Should remove "comprehensive" from intro as there are many other methods which could be done to test this.

      3)The authors should describe what PIE-1 is. Is this a transcription factor?

      4)The language needs clarification about MES-4 germline genes and bookmark genes. Are these bound by MES-4 or marked with K36me2/3?

      5)I think Fig S1 E+F should be in the main figure 1 so readers can see the extent of the phenotype.

      6)For Fig S2 it would be good to do the same statistics that are done in Fig 2 and mention them in the text so the readers can see that the overlap is statistically significant.

      7)Fig S2.2 should be yellow blue rather than red green for the colorblind out there.

      8)When saying "Many of these genes involved in these processes..." the authors need to include numbers and statistics.

      9)Should use WT instead of N2 and specify what wildtype is in methods.

      10)Fig. 2A + B could be displayed in a single figure. And Fig 2D seems superfluous and could be combined with 2C or alternatively it could be put in supplementary.

      11)Non-C. elegans experts won't understand what balancers are. An effort should be made to make this accessible to all. Explaining when genes are heterozygous or homozygous mutants seems relevant here.

      12)The GO categories (Fig. S2) should be in the main figure and need to be made to look more scientific rather than copied and pasted from a program.

      13)Fig. 7 seems a bit out of place. If the authors were to KD mes-4 and similarly show that the phenotype reverts, that would help justify its inclusion in this paper. Without that, it seems like a bit of an add-on that belongs elsewhere.

      Significance

      I think this is an interesting and timely piece of work. A little more effort needs to be put in to make sure it is accessible to the average reader and has sufficient inclusion of more of the large body of work on inheritance of histone modifications. I think C. elegans researchers as well as people interested in inheritance and the setup of the germline will be interested in this work.

      REFEREES CROSS COMMENTING

      I agree with Reviewer #2's comments on experiments to include or exclude alternative models. I also agree about their statement about rewriting to make it more accessible to others who aren't experts in this specialized portion of C. elegans research. All in all it seems like the experiments which are required by reviewer #2 and myself as well as the rewriting should be quite feasible.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Katz and colleagues examine the interaction between the methyltransferase MES-4 and spr-5; met-2 double mutants. Their prior analysis (PNAS, 2014) showed the dramatic enhancement in sterility and development for spr-5; met-2; this paper extends that finding by showing these effects depend on MES-4. The results are interesting and the genetic interactions dramatic. The examination by RNAseq and ChIP helps move the phenotypes into a more molecular analysis. The authors hypothesize that SPR-5 and MET-2 modify chromatin of germline genes (MES-4 targets) in somatic cells, and this is required to silence germline genes in the soma. A few issues need to be resolved to test these ideas and rule out others.

      Main comments:

      The authors' hypothesis is that SPR-5 and MET-2 act directly, to modify chromatin of germline genes (MES-4 targets), but an alternate hypothesis is that the key regulated genes are i) MES-4 itself and/or ii) known regulators of germline gene expression, e.g. the piwi pathway. Misregulation of these factors in the soma could be responsible for the phenotypes. Therefore, the authors should analyze expression (smFISH and where possible protein stains) for MES-4 and PIWI components in the embryo and larvae of wildtype, double and triple mutant strains. These experiments are essential and not difficult to perform.

      A second aspect of the hypothesis is that spr-5 and met-2 act before mes-4 and that while these genes are maternally expressed, they act in the embryo. There really aren't data to support these ideas - the timing and location of the factors' activities have not been pinned down. One way to begin to address this question would be to perform smFISH on the target genes and on mes-4 in embryos and determine when and where changes first appear. smFISH in embryos is critical - relying on L1 data is too late. If timing data cannot be obtained, then I suggest that the authors back off of the timing ideas or at least explain the caveats. Certainly, figure 8 should be simplified and timing removed. (note: Typical maternal effect tests probably won't work because if the genes' RNAs are germline deposited, then a maternal effect test will reflect when the RNA is expressed but not when the protein is active. A TS allele would be needed, and that may not be available.)

      Writing/clarity:

      -It would be helpful to include a table that lists the specific genes studied in the paper and how they behaved in the different assays e.g. RNAseq 1, RNAseq 2, MES-4 target, ChIP. That way, readers will understand each of the genes better.

      -At the end of each experiment, it would be helpful to explain the conclusion and not wait until the Discussion. For readers not in the field, the logic of the Results section is hard to follow.

      -The model is explained over three pages in the Discussion. It would be great to begin with a single paragraph that summarizes the model/point of the paper simply and clearly.

      Specific comments:

      -Figure 1 has been published previously and should be moved to the supplement.

      -Cite their prior paper for the vulval defects e.g. page 6 or show in supplement.

      -The second RNAseq data should be shown in the Results since it is much stronger. The first RNAseq, which is less robust, should be moved to supplement.

      -Figure 3 is very nice. Please explain why the RNAs were picked (+ the table, see comment above), and please add here or in a new figure mes-4 and piwi pathway expression data in wildtype vs double/triple mutants.

      -Figure 3 here or later, please show if mes-4 RNAi removes somatic expression of target genes.

      -Is embryogenesis delayed?

      -Figure 4 since htp-1 smFISH is so dramatic, it would be helpful to include htp-1 in the lower panels.

      -Figure 4, please add an extra 2 upper panels showing all the genes in N2 vs spr-5;met-2, for comparison to the mes-4 cohort.

      -Figure 6. Please show a control that met-1 RNAi is working.

      -To quantify histone marks more clearly, it would be wonderful to have a graph of the mean log across the gene. Showing the mean numbers would help clarify the degree of the effect. We had an image as an example but it does not paste into the reviewer box. Instead, see figure 2 or figure 4 here: https://www.nature.com/articles/ng.322

      Significance

      Katz and colleagues examine the interaction between the methyltransferase MES-4 and spr-5; met-2 double mutants. Their prior analysis (PNAS, 2014) showed the dramatic enhancement in sterility and development for spr-5; met-2; this paper extends that finding by showing these effects depend on MES-4. The results are interesting and the genetic interactions dramatic. The examination by RNAseq and ChIP helps move the phenotypes into a more molecular analysis.

      This work will be of interest to people following transgenerational inheritance, generally in the C. elegans field. People using other organisms may read it also, although some of the worm genetics may be complicated. Some of the writing suggestions could make a difference.

      I study C. elegans embryogenesis, chromatin and inheritance.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The Katz lab has contributed greatly to the field of epigenetic reprogramming over the years, and this is another excellent paper on the subject. I enjoyed reviewing this manuscript and don't have any major comments/suggestions for improving it. The findings presented are novel and important, the results are clear cut, and the writing is clear.

      It's important to stress the novelty of the findings, which build upon previous studies from the same lab (upon a shallow look one might think that some of the conclusions were described before, but this is not the case). Despite the fact that this system has been studied in depth before, it remained unclear why and how germline genes are bookmarked by H3K36 in the embryo, and it wasn't known why germline genes are not expressed in the soma.

      To study these questions, Carpenter et al. examine multiple phenotypes (developmental aberrations, sterility), which they combine with analysis of multiple genetic backgrounds, RNA-seq, ChIP-seq, single molecule FISH, and fluorescent transgenes.

      Previous observations from the Katz lab suggested that progeny derived from spr-5;met-2 double mutants can develop abnormally. They show here that the progeny of these double mutants (unlike spr-5 and met-2 single mutants) develop severe and highly penetrant developmental delays, a Pvl phenotype, and sterility. They show also that spr-5; met-2 maternal reprogramming prevents developmental delay by restricting ectopic MES-4 bookmarking, and that developmental delay of spr-5;met-2 progeny is the result of ectopic expression of MES-4 germline genes. The bottom line is that they shed light on how SPR-5, MET-2 and MES-4 balance inter-generational inheritance of H3K4, H3K9, and H3K36 methylation, to allow correct specification of germline and somatic cells. This is all very important and relevant also to other organisms.

      (very) Minor comments:

      -Since the word "heritable" is used in different contexts, it could be helpful to elaborate, perhaps in the introduction, on the distinction between cellular memory and transgenerational inheritance.

      -It might be interesting in the Discussion to expand further on the links between heritable chromatin marks and heritable small RNAs. They do hint that the results regarding the silencing of the somatic transgene are especially intriguing.

      Significance

      This is an exciting paper which builds upon years of important work in the Katz lab. The novelty of the paper is in pinpointing the mechanisms that bookmark germline genes by H3K36 in the embryo, and explaining why and how germline genes are prevented from being expressed in the soma.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the three reviewers for providing insightful critiques on our manuscript.

      Changes to the document and comments made are marked e.g. “Reply 1.1” (referring to Reviewer #1, item #1, etc.) as described below.

      Reviewer #1

      I found this study to be very convincing. Prior studies are referenced appropriately, the text is well written and clear, the figures are clear also. In my opinion the paper does not need further experiment.

      [1.1] The conclusions are well supported by the data. However, the concatenation model seems very speculative at this point. Also, it does not take into account the dynamics of these molecules.

      Reply 1.1: The concatenation model combines the structural data from our manuscript with prior biochemical insights into tetraspanin homodimerization and with scanning-EM data on immunogold-labeled CD81 and CD9 on cells. It is not completely clear to us what reviewer #1 refers to with “the dynamics of these molecules”. The cryo-EM data revealed that CD9 - EWI-F is a dynamic complex with straight and bent conformations, which could account for both circular and linear arrangements of tetraspanin-microdomains in cell membranes through the higher-order oligomerization of stable CD9 - EWI-F tetramers. Moreover, transient CD9 - CD9 interactions likely yield a variable number of complexes present in these concatenated and flexible strings of complexes. Such a concatenation model indeed requires further validation. However, it is consistent with experimental data and, importantly, provides a long-awaited molecular basis for TEM assembly. Although it was not within the scope of the current study, it will be of great interest to further investigate the concatenation model through detailed cell-biology based approaches.

      **Minor comment:**

      [1.2] There seems to be a mix up between the two structures in the following sentence p4: "In CD9EC2 - 4C8, the D loop adopts a partially helical conformation and central residue F176 is sandwiched by 4E8 residues W59 of CDR2 and W102 and R105 of CDR3 (Fig. 1D). In the 4C8-bound CD9EC2 structure the tip of the D loop points more outward and the Cα atom of F176"

      Reply 1.2: The first sentence indeed mixed up the two structures and wrongfully mentioned CD9EC2 - 4C8 instead of CD9EC2 - 4E8. This has now been updated: “In CD9EC2 - 4E8, the D loop adopts …”

      Reviewer #2

      The paper is well written and the conclusions made are supported by the data presented.

      [2.1] The ternary structure is in agreement with that of CD9 in complex with the related EWI-2 published earlier this year by Umeda et al (ref #25). The present work thus adds little structural insights but may be useful in showing that the interaction pattern seen extends to another EWI protein family member.

      Reply 2.1: We agree with reviewer #2 that the CD9 - EWI-F structure presented in our work is similar to the CD9 - EWI-2 structure published recently by Umeda et al. (ref #25). However, as also pointed out by reviewer #1, we believe that the CD9 - EWI-F structure adds new important information to understand the molecular mechanism underlying the assembly of tetraspanin-enriched microdomains. Notably, the different conformations of the CD9 - EWI-F complex observed in the cryo-EM data provide structural biology evidence for the dynamic nature of the interaction between a tetraspanin and a partner protein, which is consistent with a wealth of prior biochemical data. Guided by the distinct shape of the CD9EC2 - 4C8 densities, we were able to distinguish a range of straight to bent conformations of the complex. CD9 regions that represent known tetraspanin homo-dimerization sites orient away from EWI-F and are available for interactions. Thus, combining our structural data with previous biochemical interaction data allowed for the generation of a long-awaited model for the assembly of tetraspanin-microdomains at the molecular level. We believe that these implications for TEM assembly will stimulate new, innovative research into the molecular principles that govern the function of tetraspanins.

      [2.2] As such it may be acceptable for publication. In this case, the authors should improve the quality of Figs. 3D and 4D.

      Reply 2.2: Figures 3D and 4D depict raw cryo-electron microscopy images (micrographs). The protein complexes imaged in this study only contain light atoms (H, N, C, O, S). Therefore, the collected micrographs only reveal low-contrast images of protein particles, and, for a typical cryo-EM experiment, it is required to average particles from thousands of micrographs to obtain a 3-dimensional reconstruction. We would like to keep the raw micrographs in figures 3 and 4, as it will aid cryo-EM scientists in judging the quality of the data.

      Reviewer #3

      The work is technically well performed and clearly presented including methodological details. I just have a few minor comments:

      [3.1] Page 4 and Figure S1: it is hard to see how a reliable affinity for 4E8 can be obtained from the cell binding data in S1A, as there is no indication of saturation. It would be good to at least acknowledge that this is at best a rough estimate. Fortunately the data for this nanobody in purified situation seems solid.

      Reply 3.1: The obtained affinities are indeed estimates based on non-linear regression curve fitting of the measured data, performed in triplicate. The text has been updated and now reads as “4C8 and 4E8 bind to purified, full-length CD9 as well as to endogenous CD9 expressed on HeLa cells with apparent binding affinities in the nanomolar range (Fig. S1A, B, C)”. In addition, a table stating the calculated KDs has been included as Fig. S1C.
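
      To illustrate the kind of non-linear regression referred to in Reply 3.1, the sketch below fits a one-site binding model, signal = Bmax · c / (KD + c), to a titration series. It is a minimal example under stated assumptions: the source does not specify the binding model or software used, the function name `fit_one_site` is hypothetical, and the data are synthetic rather than taken from Fig. S1.

```python
def fit_one_site(conc, signal, kd_min=1e-3, kd_max=1e6, steps=2000):
    """Least-squares fit of signal = Bmax * c / (KD + c).

    KD is scanned on a logarithmic grid between kd_min and kd_max; for each
    candidate KD the best Bmax has a closed-form linear least-squares
    solution. Returns (KD, Bmax, sum of squared errors).
    Assumes at least one non-zero concentration.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        frac = [c / (kd + c) for c in conc]  # fractional occupancy at this KD
        bmax = sum(y * f for y, f in zip(signal, frac)) / sum(f * f for f in frac)
        sse = sum((y - bmax * f) ** 2 for y, f in zip(signal, frac))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax)
    return best[1], best[2], best[0]


# Synthetic, noise-free titration (arbitrary units); not data from the manuscript.
true_kd, true_bmax = 10.0, 100.0
conc = [0.1, 1, 3, 10, 30, 100, 1000]
signal = [true_bmax * c / (true_kd + c) for c in conc]

kd, bmax, sse = fit_one_site(conc, signal)
print(f"KD ~ {kd:.2f}, Bmax ~ {bmax:.1f}")
```

      When the highest concentrations do not approach saturation, KD and Bmax become strongly correlated in this model and many (KD, Bmax) pairs fit almost equally well, which is the concern raised in comment 3.1.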

      [3.2] Page 6: Does the absence of micellar density for the EWI-F complex indicate flexibility of the extracellular domain relative to the TM? Does this happen because the classification focuses on the highly elongated Ig region?

      Reply 3.2: These are indeed plausible assumptions. We observed highly heterogeneous, elongated particles in the micrograph shown in Fig. 3D, indicating inter-domain flexibility. If the alignment software focusses on certain Ig-like domains, other regions of the protein complex will be averaged out. An additional complexity with these elongated particles was to select an appropriate box size for particle picking and particle extraction, because the particles differ greatly in size based on their orientation (fully elongated side-views vs. much smaller top-views). When taken together, the complex of CD9 with full-length EWI-F was unsuitable for high-resolution structure determination; the subsequent strategy using EWI-FΔIg1-5 resulted in globular particles with less flexibility (Fig. 4D), which allowed for a more detailed structural characterization of the complex.

      [3.3] Page 8: "Recently, a cryo-EM density map has been reported..." - please reference here.

      Reply 3.3: We added the appropriate reference to the sentence: “Recently, a cryo-EM density map has been reported of CD9 in complex with an EWI-F homolog, EWI-2 (25).”

      [3.4] Relatively little is known about how tetraspanins help to organize partner receptors into defined membrane domains, evidence for which has emerged from super-resolution light microscopy. Based on their structural analysis of the CD9-EWI-F complex, including the heterogeneity apparent in the cryo-EM structure, they propose a feasible concatenation model for higher order oligomerization of these complexes in the membrane. Obviously the model will need to be tested rigorously by mutational analysis, particularly the EWI Ig6 interface, but as it stands the paper is a significant contribution to the field of tetraspanins.

      Reply 3.4: From the 8.6 Å cryo-EM data, the amino-acid residues that form the EWI-F Ig6 dimer interface can indeed not be distinguished. However, our data on CD9 in complex with full-length EWI-F (Fig. 3E) and previous cross-linking data (André et al. In situ chemical cross-linking on living cells reveals CD9P-1 cis-oligomer at cell surface - PMID: 19703604) support that EWI-F forms dimeric assemblies. Regarding the concatenation model, we therefore think that it will be of great interest to establish the putative CD9 - CD9 interactions (identified through biochemical approaches), that would link CD9 - EWI-F tetramers into higher assemblies, in the context of native membranes. However, investigating these transient interactions would require various non-trivial experiments and was therefore not within the scope of the current study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This paper describes the structure of the tetraspanin CD9 and its interaction with the single pass protein EWI-F. The variability in the D loop of EC2 and the domain swapping is a useful addition to the limited structural database of these proteins and correlates with the relatively poor sequence conservation of this region. The key message is that dimerization of the single pass protein extracellular region, and interaction of its transmembrane helix with the tetraspanin, produces a heterodimeric structure that may further oligomerize. The authors propose a feasible concatenation model for higher order oligomerization of these complexes in the membrane.

      The work is technically well performed and clearly presented including methodological details. I just have a few minor comments:

      Page 4 and Figure S1: it is hard to see how a reliable affinity for 4E8 can be obtained from the cell binding data in S1A, as there is no indication of saturation. It would be good to at least acknowledge that this is at best a rough estimate. Fortunately the data for this nanobody in purified situation seems solid.

      Page 6: Does the absence of micellar density for the EWI-F complex indicate flexibility of the extracellular domain relative to the TM? Does this happen because the classification focuses on the highly elongated Ig region?

      Page 8: "Recently, a cryo-EM density map has been reported..." - please reference here.

      Significance

      Relatively little is known about how tetraspanins help to organize partner receptors into defined membrane domains, evidence for which has emerged from super-resolution light microscopy. Based on their structural analysis of the CD9-EWI-F complex, including the heterogeneity apparent in the cryo-EM structure, they propose a feasible concatenation model for higher order oligomerization of these complexes in the membrane. Obviously the model will need to be tested rigorously by mutational analysis, particularly the EWI Ig6 interface, but as it stands the paper is a significant contribution to the field of tetraspanins.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, Dr. Oosterheert and colleagues report the crystal structures of CD9EC2 bound to nanobodies 4C8 and 4E8. The CD9EC2/4C8 structure was useful in determining a low resolution cryo-EM structure of EWI-F in complex with CD9/4C8. The observed sample heterogeneity of this ternary complex was reduced by deleting the N-terminal five Ig domains of EWI-F, yielding a modest maximum global resolution of ~ 8.6 Å. The structural approaches used are standard. The crystallographic and structure refinement statistics are sound, as is the cryo-EM image processing. The overall cryo-EM structure of the ternary complex shows a central EWI-F protein dimer flanked by one CD9 molecule on each side. The paper is well written and the conclusions made are supported by the data presented.

      Significance

      The ternary structure is in agreement with that of CD9 in complex with the related EWI-2 published earlier this year by Umeda et al. (ref #25). The present work thus adds little structural insight but may be useful in showing that the interaction pattern seen extends to another EWI protein family member. As such it may be acceptable for publication. In this case, the authors should improve the quality of Figs. 3D and 4D.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this article, the authors provide new insights into the structure of the tetraspanin CD9. On the one hand, they provide crystal structures of the large extracellular domain of CD9, alone or bound to two nanobodies. The 3 structures are similar to each other and to that of CD81, a related tetraspanin, except for a portion of the molecule, the so-called D-domain, showing flexibility of this domain. On the other hand, they obtained the cryo-EM structure of CD9 in association with a known partner (EWI-F) with a resolution of 8.6 Å. More precisely, the complex of CD9 and the full-length EWI-F showed heterogeneity, which they interpret as a consequence of the flexibility between the six Ig-like domains of EWI-F, precluding high-resolution structure determination. However, they showed that CD9 still interacted with a molecule lacking the 5 most membrane-distal Ig domains of EWI-F, and obtained the structure using this construct and an anti-CD9 nanobody. This structure reveals a hetero-tetrameric arrangement of CD9-EWI-F, with a central EWI-F dimer flanked by a CD9 molecule on each side. CD9 and EWI-F interact through their transmembrane domains and the two truncated EWI-F molecules through the remaining Ig domains. Importantly, CD9 and EWI-F do not make contacts in the extracellular region, and CD9 shows a semi-open conformation. The structure also shows different configurations of the complex.

      I found this study to be very convincing. Prior studies are referenced appropriately, the text is well written and clear, the figures are clear also.

      In my opinion the paper does not need further experiments.

      The conclusions are well supported by the data. However, the concatenation model seems very speculative at this point. Also, it does not take into account the dynamics of these molecules.

      Minor comment:

      There seems to be a mix up between the two structures in the following sentence p4: "In CD9EC2 - 4C8, the D loop adopts a partially helical conformation and central residue F176 is sandwiched by 4E8 residues W59 of CDR2 and W102 and R105 of CDR3 (Fig. 1D). In the 4C8-bound CD9EC2 structure the tip of the D loop points more outward and the Cα atom of F176"

      Significance

      Tetraspanins have been shown over the years to play an essential role in various biological functions. Among them, CD9, which is strongly expressed on the oocyte plasma membrane, is essential for sperm-egg fusion. However, the mechanisms by which CD9 regulates this fusion process as well as other cell-cell fusion events remain unknown. The elucidation of its structure and of how it interacts with well characterized partner proteins is clearly a major advance in our understanding of the function of this molecule.

      The absence of a structure for tetraspanins has long been a knowledge gap. Following a breakthrough in 2001 with the publication of the crystal structure of the large extracellular domain of CD81 (Kitadokoro et al., EMBO J 2001), it was only recently that the structure of a full length tetraspanin, again that of CD81, was published (Zimmermann et al., Cell 2016). Earlier this year, the crystal structure of a truncated version of CD9 was published, as well as the cryo-EM structure of CD9 in association with another molecular partner, EWI-2 (Umeda et al., Nat. Commun. 2020).

      The present structure adds new important information such as the existence of different conformations in the large extracellular domain of CD9 or the structure of CD9 with another molecular partner. It also highlights the different configurations of the complex. It will be of interest to researchers interested in tetraspanins, in membrane organization as well as researchers interested in the biological processes regulated by CD9, notably sperm-egg fusion.

      My field of expertise concerns tetraspanins. I cannot comment on the technical aspects of the structures.

  4. Jul 2020
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      1. The first hypothesis of the manuscript is that, rather than a change in a single immune pathway being responsible for the lack of response to the virus, the response will be systemic involving multiple inter-related pathways. The data show that this was the case after presenting convincing transcriptome analysis.

      We thank the reviewer for agreeing that we have convincingly shown that the response to the virus is systemic, involving the induction of interrelated pathways.

      The second hypothesis is that the differences in responses between bats and humans are due to evolutionarily divergent genes. The authors provide evidence for this in the transcriptome differences in the C-reactive protein, aspects of the complement system, iron regulation and M1/M2 macrophage polarization. The second hypothesis is broad, but there are clearly differences in the genes involved in humans and bats. Without mechanistic information on the function of the proteins/cells investigated, it is hard to determine that the changes the authors are observing are the cause of the different responses, rather than an effect of some upstream response, and so difficult to pin-point specific divergent genes.

      We agree that mechanistic studies will be required to test causal links between the genes we identified and specific anti-viral responses, an effort that is likely to require multiple laboratories and some time. The aim of this study was to enable this effort by identifying a list of candidate genes affected by EBOV and MARV infection in bats, not merely in cultured bat cells.

      The authors wish to compare the response to the virus in bats to the better characterized human tissue responses, but because this relies on previously published work in humans, it is sometimes unclear whether "more bat-like" responses are definitely associated with positive outcomes in humans. As the benefit of certain responses in human infections can depend on the timing of the response, it might be helpful to include summarized human data in the manuscript to aid comparison with the bat responses.

      We agree and have added the following data and discussion (inserted into Discussion, page 9, and added two new tables, Tables 2 and 3).

      Comparing our observations to human responses to filoviruses is limited by the scarcity of studies in humans. Nevertheless, this comparison suggests potential directions to explore. In one study, individuals who succumbed to the disease showed stronger upregulation of interferon signaling and acute phase responses compared to survivors during the acute phase of infection[1], consistent with the anti-inflammatory response gene expression signature identified in this study in bats. However, most of the genes used in the study by Liu et al. to classify survivors are either barely expressed in bats or do not respond to filoviral infection (Table 2), differences that provide potential clues as to why bats can tolerate the infection.

      A study of patients infected with Sudan Ebola virus (SUDV) analyzed protein levels for a panel of genes using an antibody-based Luminex multiplex assay[2]. The panel was based on results from other studies and pathways involved in the response to infections. The patients were classified along three dichotomies (fatal/non-fatal, hemorrhaging/non-hemorrhaging, or high/low viremia), which were correlated with genes that characterized these states. Most of these genes either are barely expressed, if at all, or are unaffected by infection in bats, except for ferritin (FTL, FTH1), whose expression is lowered by MARV infection, consistent with the observation that ferritin is higher in fatal human cases (Table 3).

      For instance, the T-cell response section concludes "Bats mount a T cell response against the infection" but there is no discussion of the impaired but complex lymphocyte response in humans, so comparison is not possible.

      We have expanded the discussion on T cells (Results, page 7) as follows.

      Previous studies on the adaptive immune response to Ebola and Marburg viruses in humans, non-human primates, and non-primate mammals show that long-term immunity is conferred by both T cell and antibody responses. Mostly CD8+ T cells were elicited and helpful against Ebola in mice[3],[4], while SUDV infection in humans[5] and MARV infection in cynomolgus monkeys[6] and humans[7] elicited mostly CD4+ T cells. In most human EBOV infections, CD8+ T cells against the EBOV NP protein dominated the responses, while a minority of individuals harbored memory CD8+ T cells against the EBOV-GP[8].

      Consistent with this, in MARV-infected bats, CD4 expression (specific to CD4+ T cells) was higher, while in EBOV-infected bats, CD8 expression (specific to CD8+ T cells) was higher. The overall levels are low because the tissue samples are heterogeneous and expression of these markers is not high in T cells to begin with. T cell markers (such as CCL3, ANAX1, TIMD4 and MAGT1) are also upregulated in liver, suggesting that a T cell response is mounted.

      Mock infected IHC should be included in Figure 1F to demonstrate the antibodies are not background.

      We have added IHC data of two mock-infected animals (Fig. S1 panels A and B).

      See comment on hypotheses: a summarized table of findings from previous studies of early responses to the virus would be helpful for comparisons to the bat response and for evaluating the second hypothesis.

      We have expanded our comparisons to previous studies by adding the following text to the Introduction (page 3).

      A potential source of the difficulty in understanding how bats tolerate or eliminate viruses that are deadly to humans is the lack of studies that analyze the response to infection in bats rather than in cultured bat cells. The results obtained using cell lines have been contradictory. Some studies claim that both EBOV and MARV replicate to similar levels in ERB- and human-derived cell lines[9], with a robust innate immune response mounted by ERB and, to a lesser degree, human cells, while others claim that MARV inhibited the antiviral program in ERB cells, as in primate cells, and induced almost no IFN genes[10], or found little anti-viral gene induction[11]. An experiment with pig (PK15A) and bat (EhKiT) cells suggested that they responded to EBOV through the upregulation of immune, inflammatory, and coagulation pathways, in contrast to a limited response in human (HEK293T) cells[12]. To comprehensively understand the pathways involved in the bat filoviral response, we infected bats, rather than their isolated cells, and analyzed tissue-specific RNA expression through mRNA-seq in the organs of the infected animals.

      Reviewer #2

      1. The authors provide this contribution to the extremely interesting topic of the immunobiology that facilitates filovirus infections of bats without overt pathology. They focused entirely on gene transcription signatures from different tissue sites following experimental infection, and sometimes compare those signatures with those generated in humans following natural exposures to filoviruses. The strength of the paper is the sheer breadth of data generated that is available openly to the scientific community and the development of novel mRNA datasets from bats, in the absence and presence of infection. One of the major limitations of this systems-based approach is that there is no mechanistic data that links gene function to the immune response to filovirus infection. Rather, associations are made and functional links are inferred. This limitation makes the title of the manuscript "...is controlled by a systemic response" an overstatement.

      We thank the reviewer and agree that mechanistic studies were out of scope of this study and have reflected this fact in the title by replacing “is controlled” with “induces”:

      Ebola and Marburg filovirus infection in bats induces a systemic response

      The authors indicate that one of their main objectives is to understand differences in the responses to infection between bats and humans. But this submission says little about the transcriptome-level responses to filovirus infection in humans. It does, on at least one occasion, state that some of the bat genes with altered expression levels were also altered in a study of human filovirus infections (reference #67). I think it would be helpful if the authors devoted a figure or table to the direct comparison between their analysis of MARV- and EBOV-infected bats and the findings of filovirus-infected humans, highlighting genes that are differentially up- or downregulated between the two species.

      This discussion, which was also requested by Reviewer 1, is now included in the manuscript (Discussion page 9 and Tables 2 and 3).

      Figure 2 is neither described nor presented usefully. Instead of providing a figure title "Upset plot...", the authors should clearly describe the type of transcriptomic data being presented. Moreover, the way the data is plotted does not reveal any direct information about the genes that are up- or downregulated in each condition, thus reducing its utility to the reader. I suggest that this Figure be placed in the Supplemental information. In fact, Figure 3 could also be moved to the Supplemental information.

      Figure 2 makes the point that the response is a broad one, while Figure 3 presents evidence from expression data that there are tissue-specific responses to the viruses. Both together provide convincing evidence of a systemic, wide-ranging response to both MARV and EBOV infections. We have edited the caption to Figure 2 by changing it to the following:

      Figure 2: Broad response of bat liver genes to filoviral infection. Many genes in the liver respond to filoviral infections, with MARV having a bigger impact than EBOV (840 genes are responsive to MARV alone, compared to the 43 specific to EBOV alone). The EBOV-specific (EBOV/MARV) and MARV-specific (MARV/EBOV) genes are likely host responses specific to the viral VP40, VP35 and VP24 genes. In the plot, mock refers to mock-infected, EBOV to EBOV-infected, and MARV to MARV-infected bat livers. Each row in the lower panel represents a set; there are six sets of genes based on various comparisons, e.g., EBOV/mock is the set of genes at least 2-fold upregulated in EBOV infection compared to the mock samples. The gray bars at the lower left represent membership in the sets. The vertical blue lines with bulbs represent set intersections, e.g., the last bar is the set of genes common to EBOV/MARV, EBOV/mock and MARV/mock, so the genes in this set are up 2-fold in EBOV compared to the mock and MARV samples, and at least 2-fold up in MARV compared to mock. The main bar plot (top) shows the number of genes unique to each intersection, so the total belonging to a set, say mock/EBOV, is the sum of the numbers in all intersections that have mock/EBOV as a member (41+203+6+31=281).
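      The counting rule in the last sentence of the caption can be sketched in a few lines of Python. This is only an illustration: the four counts (41, 203, 6, 31) and their total are taken from the caption, but which other sets share each intersection with mock/EBOV is not stated there, so the memberships below are assumed, and the `MARV/mock` entry is a hypothetical placeholder included only to show that intersections without the queried set are ignored.

```python
from typing import Dict, FrozenSet


def set_total(exclusive_counts: Dict[FrozenSet[str], int], set_name: str) -> int:
    """Sum the exclusive intersection counts of every intersection containing set_name."""
    return sum(n for members, n in exclusive_counts.items() if set_name in members)


# Counts 41, 203, 6 and 31 come from the caption; the co-members of each
# intersection are ASSUMED for illustration only.
exclusive_counts = {
    frozenset({"mock/EBOV"}): 41,
    frozenset({"mock/EBOV", "mock/MARV"}): 203,
    frozenset({"mock/EBOV", "MARV/EBOV"}): 6,
    frozenset({"mock/EBOV", "mock/MARV", "MARV/EBOV"}): 31,
    frozenset({"MARV/mock"}): 100,  # hypothetical count, not from the caption
}

mock_ebov_total = set_total(exclusive_counts, "mock/EBOV")
print(mock_ebov_total)
```

      Because each bar in the top panel counts genes unique to one intersection, no gene is counted twice, which is why a plain sum recovers the size of the full set.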

      The authors do not specify in the main text, figure captions, or methods sections how they objectively assigned bat homologs as being "similar to " or "divergent from" their human counterparts. What is the cut-off in terms of sequence similarity?

      We apologize for this omission. In addition to a description in Methods, we have added the following statement to the Results section (Page 4).

      To identify divergent genes, we relied on BLASTn[13]. Genes detected as homologues (16,004, or 87% of the 18,443 genes in our database) using BLASTn default settings were labelled “similar”. The remaining 2,439 genes (13%) were considered “divergent”. Of these genes, 1,548 transcripts (8% of the total) could be identified as homologous by reducing the word size in BLASTn from 11, the default, to 9. This approach is equivalent to matching at the protein level, but we find that using nucleotide level matches provides a cleaner separation of the two classes than using translated proteins (Fig. 4, Methods).
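      The bookkeeping behind these numbers can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it does not run BLASTn, the `classify` helper and its labels are hypothetical, and the hit sets stand in for whatever lists of genes returned a homolog at each word size. Only the gene counts are taken from the text above.

```python
def classify(gene: str, hits_default: set, hits_word9: set) -> str:
    """Label a gene by the search setting (if any) that found a human homolog.

    hits_default: genes with a hit at default settings (assumed input).
    hits_word9:   genes with a hit only after reducing the word size to 9 (assumed input).
    """
    if gene in hits_default:
        return "similar"
    if gene in hits_word9:
        return "divergent (homolog found at word size 9)"
    return "divergent (no homolog found)"


def percent(part: int, total: int) -> int:
    """Percentage of total, rounded to the nearest integer."""
    return round(100 * part / total)


# Counts quoted in the text.
total_genes = 18443
similar = 16004
recovered_at_word9 = 1548

divergent = total_genes - similar
print(divergent, percent(similar, total_genes), percent(divergent, total_genes))
print(percent(recovered_at_word9, total_genes))
```

      The arithmetic confirms that the quoted counts and percentages are mutually consistent: the similar and divergent classes partition the 18,443 genes, and the genes recovered at the smaller word size are a subset of the divergent class.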

      In the Discussion, it is surprising that the authors state that "the majority of interferon response genes are not divergent from human homologs" since genes involved in innate immunity are some of the most rapidly evolving genes known to exist. Again, clarification over what dictates "divergence" over "similarity" is warranted. Many previous studies have shown how a single residue change in an innate immune effector can drastically alter its specificity and/or potency.

      We have clarified this point by adding the following statement in the Discussion (pages 8 and 9).

      There are hundreds of genes involved in the interferon response. Some key components can mutate to change the specificity of their interactions, but most, especially those in the core ISG category[14], evolve slowly and have conserved function and sequence[15]. Our analysis of gene divergence shows that the majority of interferon response genes are not divergent from their human homologs, consistent with prior observations that the innate responses are quite similar between human and bat cell lines[9]. This implies that other systems are involved in generating the difference in response between bats and humans.

      The authors state in the introduction, and point to citation #21, that ERBs are "refractory to infection." In Figure 1, the authors indicate that experimental infection of ERBs with EBOV led to detectable infection in some animals, particularly in the liver. At this point in the manuscript, the authors should state if and how this result differs from what is published in #21, and they should comment on whether this is scientifically significant, or not. This is eventually discussed briefly in the Discussion but adding a sentence to the Results section would be helpful for readers.

      To emphasize that our results contradict prior reports of ERB being refractory to EBOV infection, we have modified the statement in the Results (page 3) as follows.

      Two of the three EBOV-inoculated animals presented with histopathological lesions in the liver, consisting of pigmented and unpigmented infiltrates of aggregated mononuclear cells compressing adjacent tissue structures, and eosinophilic nuclear and cytoplasmic inclusions, changes consistent with previous reports[16], [17]. In EBOV-infected animals, focal immunostaining with both pan-filovirus and EBOV-VP40 antibodies was observed in the liver of one animal, but very few foci were found, suggesting limited viral replication.

      The research question at hand, concerning how bats serve as reservoirs for multiple viruses which are pathogenic to humans without succumbing to disease, is one of the hottest topics in immunology and virology. However, the authors do not provide a clear enough explanation of how their approach to study the transcriptome response following filovirus infection goes beyond what has been published in previous studies. This manuscript would greatly benefit from a discussion of its novelty in the Introduction and Discussion sections.

      We have reviewed prior human and bat studies (Introduction, page 3, and Discussion, page 9, shown above) to highlight the novelty of our findings. We have also added the following sentence at the end of the Introduction highlighting the novelty of the study.

      This is the first in vivo study that focuses on the coordinated transcriptional response to filoviruses at the level of individual organs in bats.

      References

      [1] X. Liu et al., “Transcriptomic signatures differentiate survival from fatal outcomes in humans infected with Ebola virus,” Genome Biology, vol. 18, no. 1, p. 4, Jan. 2017, doi: 10.1186/s13059-016-1137-3.

      [2] A. K. McElroy et al., “Ebola hemorrhagic Fever: novel biomarker correlates of clinical outcome,” J. Infect. Dis., vol. 210, no. 4, pp. 558–566, Aug. 2014, doi: 10.1093/infdis/jiu088.

      [3] S. B. Bradfute, K. L. Warfield, and S. Bavari, “Functional CD8+ T cell responses in lethal Ebola virus infection,” J. Immunol., vol. 180, no. 6, pp. 4058–4066, Mar. 2008, doi: 10.4049/jimmunol.180.6.4058.

      [4] M. N. Rahim et al., “Complete protection of the BALB/c and C57BL/6J mice against Ebola and Marburg virus lethal challenges by pan-filovirus T-cell epigraph vaccine,” PLOS Pathogens, vol. 15, no. 2, p. e1007564, Feb. 2019, doi: 10.1371/journal.ppat.1007564.

      [5] A. Sobarzo et al., “Multiple viral proteins and immune response pathways act to generate robust long-term immunity in Sudan virus survivors,” EBioMedicine, vol. 46, pp. 215–226, Aug. 2019, doi: 10.1016/j.ebiom.2019.07.021.

      [6] L. Fernando et al., “Immune Response to Marburg Virus Angola Infection in Nonhuman Primates,” J Infect Dis, vol. 212, no. suppl_2, pp. S234–S241, Oct. 2015, doi: 10.1093/infdis/jiv095.

      [7] S. W. Stonier et al., “Marburg virus survivor immune responses are Th1 skewed with limited neutralizing antibody responses,” J. Exp. Med., vol. 214, no. 9, pp. 2563–2572, Sep. 2017, doi: 10.1084/jem.20170161.

      [8] S. Sakabe et al., “Analysis of CD8+ T cell response during the 2013–2016 Ebola epidemic in West Africa,” PNAS, vol. 115, no. 32, pp. E7578–E7586, Aug. 2018, doi: 10.1073/pnas.1806200115.

      [9] I. V. Kuzmin et al., “Innate Immune Responses of Bat and Human Cells to Filoviruses: Commonalities and Distinctions,” J. Virol., vol. 91, no. 8, Apr. 2017, doi: 10.1128/JVI.02471-16.

      [10] C. E. Arnold et al., “Transcriptomics Reveal Antiviral Gene Induction in the Egyptian Rousette Bat Is Antagonized In Vitro by Marburg Virus Infection,” Viruses, vol. 10, no. 11, Nov. 2018, doi: 10.3390/v10110607.

      [11] M. Hölzer et al., “Differential transcriptional responses to Ebola and Marburg virus infection in bat and human cells,” Scientific Reports, vol. 6, p. 34589, Oct. 2016, doi: 10.1038/srep34589.

      [12] J. W. Wynne et al., “Comparative Transcriptomics Highlights the Role of the Activator Protein 1 Transcription Factor in the Host Response to Ebolavirus,” Journal of Virology, vol. 91, no. 23, Dec. 2017, doi: 10.1128/JVI.01174-17.

      [13] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” J. Mol. Biol., vol. 215, no. 3, pp. 403–410, Oct. 1990, doi: 10.1016/S0022-2836(05)80360-2.

      [14] A. E. Shaw et al., “Fundamental properties of the mammalian innate immune system revealed by multispecies comparison of type I interferon responses,” PLOS Biology, vol. 15, no. 12, p. e2004086, Dec. 2017, doi: 10.1371/journal.pbio.2004086.

      [15] T. B. Sackton, B. P. Lazzaro, T. A. Schlenke, J. D. Evans, D. Hultmark, and A. G. Clark, “Dynamic evolution of the innate immune system in Drosophila,” Nat. Genet., vol. 39, no. 12, pp. 1461–1468, Dec. 2007, doi: 10.1038/ng.2007.60.

      [16] M. E. B. Jones et al., “Experimental Inoculation of Egyptian Rousette Bats (Rousettus aegyptiacus) with Viruses of the Ebolavirus and Marburgvirus Genera,” Viruses, vol. 7, no. 7, pp. 3420–3442, Jun. 2015, doi: 10.3390/v7072779.

      [17] J. T. Paweska, N. Storm, A. A. Grobbelaar, W. Markotter, A. Kemp, and P. Jansen van Vuren, “Experimental Inoculation of Egyptian Fruit Bats (Rousettus aegyptiacus) with Ebola Virus,” Viruses, vol. 8, no. 2, Jan. 2016, doi: 10.3390/v8020029.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors provide this contribution to the extremely interesting topic of the immunobiology that facilitates filovirus infections of bats without overt pathology. They focused entirely on gene transcription signatures from different tissue sites following experimental infection, and sometimes compare those signatures with those generated in humans following natural exposures to filoviruses. The strength of the paper is the sheer breadth of data generated that is available openly to the scientific community and the development of novel mRNA datasets from bats, in the absence and presence of infection. One of the major limitations of this systems-based approach is that there is no mechanistic data that links gene function to the immune response to filovirus infection. Rather, associations are made and functional links are inferred. This limitation makes the title of the manuscript "...is controlled by a systemic response" an overstatement.

      Major points:

      The authors indicate that one of their main objectives is to understand differences in the responses to infection between bats and humans. But this submission says little about the transcriptome-level responses to filovirus infection in humans. It does, on at least one occasion, state that some of the bat genes with altered expression levels were also altered in a study of human filovirus infections (reference #67). I think it would be helpful if the authors devoted a figure or table to the direct comparison between their analysis of MARV- and EBOV-infected bats and the findings of filovirus-infected humans, highlighting genes that are differentially up- or downregulated between the two species.

      Figure 2 is neither described nor presented usefully. Instead of providing a figure title "Upset plot...", the authors should clearly describe the type of transcriptomic data being presented. Moreover, the way the data is plotted does not reveal any direct information about the genes that are up- or downregulated in each condition, thus reducing its utility to the reader. I suggest that this Figure be placed in the Supplemental information. In fact, Figure 3 could also be moved to the Supplemental information.

      The authors do not specify in the main text, figure captions, or methods sections how they objectively assigned bat homologs as being "similar to " or "divergent from" their human counterparts. What is the cut-off in terms of sequence similarity?

      In the Discussion, it is surprising that the authors state that "the majority of interferon response genes are not divergent from human homologs" since genes involved in innate immunity are some of the most rapidly evolving genes known to exist. Again, clarification over what dictates "divergence" over "similarity" is warranted. Many previous studies have shown how a single residue change in an innate immune effector can drastically alter its specificity and/or potency.

      Minor points:

      The authors state in the introduction, and point to citation #21, that ERBs are "refractory to infection." In Figure 1, the authors indicate that experimental infection of ERBs with EBOV led to detectable infection in some animals, particularly in the liver. At this point in the manuscript, the authors should state if and how this result differs from what is published in #21, and they should comment on whether this is scientifically significant, or not. This is eventually discussed briefly in the Discussion but adding a sentence to the Results section would be helpful for readers.

      Significance

      The research question at hand, concerning how bats serve as reservoirs for multiple viruses which are pathogenic to humans without succumbing to disease, is one of the hottest topics in immunology and virology. However, the authors do not provide a clear enough explanation of how their approach to study the transcriptome response following filovirus infection goes beyond what has been published in previous studies. This manuscript would greatly benefit from a discussion of its novelty in the Introduction and Discussion sections.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Jayaprakash et al. investigates the response to the filoviruses Marburg and Ebola virus in Rousettus aegyptiacus bats, the natural reservoir of Marburg virus. The response to infection is investigated by comparing transcriptomes of different bat tissues in infected and uninfected bats. The manuscript groups the observed transcriptome changes into pathways that are impacted, and discusses how those pathways may cause subclinical infection in bats, compared to severe disease in humans. The data included also shed light on bat immunology and reservoir characteristics more generally, which is particularly timely during the SARS-CoV-2 pandemic.

      Major comments:

      Are the key conclusions convincing?

      The first hypothesis of the manuscript is that, rather than a change in a single immune pathway being responsible for the lack of response to the virus, the response will be systemic involving multiple inter-related pathways. The data show that this was the case after presenting convincing transcriptome analysis. The second hypothesis is that the differences in responses between bats and humans are due to evolutionarily divergent genes. The authors provide evidence for this in the transcriptome differences in the C-reactive protein, aspects of the complement system, iron regulation and M1/M2 macrophage polarization. The second hypothesis is broad, but there are clearly differences in the genes involved in humans and bats. Without mechanistic information on the function of the proteins/cells investigated, it is hard to determine that the changes the authors are observing are the cause of the different responses, rather than an effect of some upstream response, and so difficult to pin-point specific divergent genes. The authors wish to compare the response to the virus in bats to the better characterized human tissue responses, but because this relies on previously published work in humans, it is sometimes unclear whether "more bat-like" responses are definitely associated with positive outcomes in humans. As the benefit of certain responses in human infections can depend on the timing of the response, it might be helpful to include summarized human data in the manuscript to aid comparison with the bat responses. For instance, the T-cell response section concludes "Bats mount a T cell response against the infection" but there is no discussion of the impaired but complex lymphocyte response in humans, so comparison is not possible.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      No, speculative discussion of potential drugs is already qualified as speculative, and adds to the understanding of the significance of the data.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      No

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      N/A

      Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      Minor comments:

      Specific experimental issues that are easily addressable.

      Mock-infected IHC should be included in Figure 1F to demonstrate that the antibody staining is not background.

      Are prior studies referenced appropriately?

      Mostly yes. The discussion of the T-cell responses in infection could be expanded to include more information on human responses.

      Are the text and figures clear and accurate?

      Yes

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      See the comment on the hypotheses above: a summarized table of findings from previous studies of early responses to the virus would be helpful for comparisons to the bat response and for evaluating the second hypothesis.

      Significance

      Nature and Significance of the advance.

      Bat immune responses to filoviruses are poorly characterized, and this paper contains much information that can aid future investigation of reservoir responses. This data also has broad application to other bat-borne pathogens.

      Compare to existing published knowledge.

      There is little published about in vivo bat immune responses to filoviral infections. Significantly, this report describes a non-refractory response to Ebola virus infection in Rousettus aegyptiacus.

      Audience

      This paper would be of interest to filovirologists and those interested in zoonotics and bat immunology.

      Your expertise.

      I am a viral immunologist with >15 years' experience with filoviruses. Ms. Clarke is a senior graduate student whose thesis focuses on immune responses to filovirus glycoproteins.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      INITIAL RESPONSE TO REVIEWERS / REVISION PLAN

      We are grateful to the three reviewers for reviewing our manuscript and providing comments that helped to further improve the quality of the current study. We attach an initial revised version of the manuscript in which the changes made in response to the reviewers’ comments are highlighted. We now provide:

      • 18 new main figure panels (Fig.1E, Figs.2D-F, Figs.3E-F, Figs.4B,C,E, Figs.6B-F, Figs.7B,D,E,F),
      • 9 new supplementary figures, and
      • 13 new supplementary tables, that correspond to the points raised by the reviewers.

      In this initial response to reviewers and revision plan we have already performed the bioinformatics analysis and the majority of the new wet lab experiments requested by the reviewers. We are still awaiting only the results of three sets of wet lab experiments (RIP-seq, additional protein/RT-qPCR confirmations and B2 incubations with other proteins), which, due to their nature, take longer. We have also revised the main text accordingly, with only a number of updates (regarding some methods of experiments currently in progress and the respective discussion) still missing.

      In detail:

      REVIEWER 1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      B2 RNAs, encoded from SINE B2 elements, have been directly implicated in the stress response through their inherent ability to bind RNA Pol II and suppress stress response genes (SRGs) in homeostatic conditions. However, upon stimuli, B2 RNAs are cleaved and degraded, resulting in the release of RNA Pol II and upregulation of SRGs. Previous work from the senior author identified the PRC2 component EZH2 as the B2 RNA processing factor, cleaving B2 and releasing Pol II. SRGs are upregulated upon stress, for example in age-associated neuropathologies like Alzheimer's disease (AD). Considering that the hippocampus is a primary target of amyloid pathologies, and since SRGs are suggested to be key for the function of a healthy hippocampus, the authors set out to understand the role of B2 RNAs that are linked to SRG regulation in the mouse hippocampus with amyloid pathology. They use disease-relevant in vivo and in vitro models combined with unbiased RNA-seq data analysis for this endeavor, which indicates the potential relevance of B2 RNAs in APP-mediated neuronal pathologies in mice and identifies Hsf1 as the factor cleaving B2 RNAs in the hippocampus.

      This reviewer generally remarks that “The work is interesting and identification of Hsf1 as the processing factor for B2 RNAs in the hippocampus is significant. I would like to credit the authors for their elegant in vivo experimental design in Figure 2.”

      We appreciate the encouraging comments made by this reviewer.

      General comment: The reviewer finds “some of the conclusions to be overstated” and has brought a number of concerns to our attention. Indeed, we agree that additional data and details are needed to avoid any confusion about the gene pathways to which our findings apply. In the initial manuscript (Figures 2D, F and 6D, F), we presented the gene expression levels of all B2 RNA regulated SRGs identified in our previous study (Zovoilis et al, Cell 2016), referred to as B2 RNA regulated SRGs or B2-SRGs throughout the manuscript. To this end, we performed the respective statistical tests between the different conditions considering these genes, in order to show the transcription dynamics of these genes in either amyloid beta pathology (APP mice / Figs. 2D, F) or amyloid beta toxicity (HT22 cells / Figs. 6D, F). Since we were not looking for new candidate genes upregulated in APP mice or in our HT22 cell culture system, we did not narrow our analysis only to genes delivered by a general-purpose differential gene expression approach such as DESeq but tested all B2-SRGs. However, based on the reviewer’s comments below, we realize that the paper would benefit from presenting in the main figures only those B2 RNA regulated SRGs that overlap with differentially expressed genes identified by DESeq in each experimental system. This will help to avoid confusion and any misunderstanding that all B2 RNA regulated genes are equally affected in our system, which is not the case and would be an overstatement. In new Figure 2 (2E, 2F) we now present only those B2-SRGs that overlap with upregulated genes identified by DESeq in 6m old APP mice (listed in new Suppl. Table 5), and in new Figure 7 (7D, F) we now present only those B2-SRGs that overlap with upregulated genes identified by DESeq in HT22 cells treated with amyloid beta (listed in new Suppl. Table 11).

      The conclusions drawn from the new figures remain the same as with the old ones, and we believe that this new way of presenting the data will prevent confusion and potential overstatements. We thank the reviewer for bringing this to our attention. Based also on this reviewer’s minor point 3, we recommend that the old figures that included all B2-SRGs (and not only the differentially expressed ones identified by DESeq) are moved to the Supplement as new Supplementary Figures 1 and 7, respectively, so that readers can still get a view of all the data and the transcription dynamics of all B2-SRGs, while we provide both in the text and the supplement an explanation of the value as well as the limitations of these figures.
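
The filtering step described in this general comment (keeping only those B2-SRGs that also appear among the upregulated genes from a differential expression analysis) amounts to a set intersection. The following is a minimal sketch for illustration only, not the pipeline used in the manuscript; the function name and the gene identifiers are hypothetical placeholders.

```python
def overlap_with_upregulated(b2_srgs, upregulated):
    """Return the B2-SRGs that also appear in the upregulated list, sorted by name."""
    return sorted(set(b2_srgs) & set(upregulated))


# Placeholder identifiers, not genes from the study.
b2_srgs = ["GeneA", "GeneB", "GeneC", "GeneD"]
upregulated = ["GeneB", "GeneD", "GeneX"]

shown_in_main_figure = overlap_with_upregulated(b2_srgs, upregulated)
print(shown_in_main_figure)
```

Genes present in only one of the two lists are dropped, which is why the restricted main figures can no longer be read as a statement about all B2 RNA regulated genes.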

      **Major comments:**

      Major point 1. The reviewer asks: “In figure 1, the authors indicate a strong connection between B2 RNA regulated SRGs and learning and memory. In figure 2, they identify the SRGs in the hippocampus, please provide a direct comparison of learning and memory associated SRGs and the SRGs they identify in figure 2 that are significantly upregulated in APP mice in 6 months.”

      In the revised version of the manuscript we now provide: i) As a new figure panel (lower panel in new Fig.1E), the number of B2 RNA regulated SRGs that are associated with learning based on our Peleg et al, Science 2010 paper and as a new Supplementary Table 3, the exact list of these genes. ii) As a new Supplementary Table 4, the list of all genes that are significantly upregulated in APP mice (6 months). iii) As a new Supplementary Table 5, the list of those genes upregulated in amyloid pathology (APP 6 months) that are B2-SRGs (expression levels of these genes are presented in new Figure 2E,F). Per reviewer’s question, we now provide as a new Supplementary Table 6, the list of B2 RNA regulated SRGs that are both learning associated genes and upregulated in 6 month old APP mice. In the text (first two sections of the results), we provide direct comparisons of the number of genes in each category and their overlap.

      Major point 2. The reviewer asks: “To better understand the data in the context of hippocampal function, please include functional annotation of SRGs they identified in Figure 2F as they do it in Figure 1 (desirably for each time point, at least for 6M). How many of the SRGs they identify in Figure 1 are part of Figure 2F? Please include functional annotation of significantly upregulated B2 regulated SRGs in Fig2 and compare them with that of Figure 1.”

      The number of B2 RNA regulated SRGs in Figure 1 that are part of Figure 2 (in particular Figs.2E,F) is now presented in the new Supplementary Table 5 and also in the text. We now provide as a new Supplementary Table 7 the functional annotation of these genes (see also general comment for this reviewer) and discuss the findings in the text.

      We recommend including only the 6M old mice, as this is the time point at which B2 RNA processing was found to differ between WT and APP mice. However, if the reviewer thinks that this is necessary, we will also add differential expression lists for other ages as additional supplementary tables.

      Major point 3. The reviewer asks: “In figure 3, the authors report that the B2 processing rates are high at the 6M time point at in hippocampi of the APP mice. Please include the levels of unprocessed and processed B2 RNAs in these samples along with this figure, without which it is difficult to gauge the significance of its correlation with SRGs in Figure 2.”

      We now provide as new figure panels 3E and 3F the levels of processed B2 RNA fragments and unprocessed (full length) B2 RNAs in these samples, respectively, along with the processing ratio which is now labeled as subfigure 3G.

      Major point 4. The reviewer asks: “What is the % of B2 regulated SRGs that are hsf1 bound in Figure 4C? What is there dynamics in the wild type and APP hippocampi?”.

      Old Figure 4C is now Figure 4A. The exact number of B2 RNA regulated SRGs that are close to Hsf1 binding sites is now presented as a new figure (Figure 4C) and discussed in the text. A list of these genes is provided as new Supplementary Table 8. For genes that are upregulated in APP mice compared to wild type, the difference in Hsf1 binding dynamics between B2 RNA regulated and not regulated genes is now presented as Suppl. Figure 4D.

      Major point 5. The reviewer asks: “What is the distribution of Hsf1 binding sites on (a) non-B2 regulated SRGs and (b) non-SRG genes in hippocampi?”.

      This point is related to point 4. We now present a new panel (Fig. 4B) for non B2 RNA regulated genes (listed in Suppl. Table 13) along with the distribution we had in the initial manuscript for all B2 RNA regulated SRGs (now presented as Fig. 4A). The direct comparison of these genes is presented in the new Suppl. Figure 4C together with a similar comparison only for genes upregulated in APP mice (Suppl. Fig. 4D).

      Major point 6. The reviewer notes: “In Figure 4D, the 3months old Wt HSF1 levels are high, yet B2 processing (Figure 3E) is low. Please comment.”

      The reviewer’s comment made us realize that we should include a plot that describes the correlation between Hsf1 levels and B2 RNA processing ratio across all sequenced samples. This should reveal whether differences such as those observed by the reviewer affect our conclusion regarding the relationship between these two parameters. We now provide this in the new Supplementary Figure 6D, where we found a strong positive correlation between Hsf1 levels and B2 RNA processing ratio. We thank the reviewer for this comment, which helped us to further substantiate this relationship.
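
For readers who want to see what such a per-sample correlation involves, here is a minimal sketch. It assumes a Pearson coefficient, which the response does not specify, and the sample values are invented placeholders rather than data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values: one Hsf1 expression level and one processing ratio per sample.
hsf1_levels = [1.0, 2.0, 3.0, 4.0]
processing_ratios = [0.2, 0.4, 0.6, 0.8]
print(pearson_r(hsf1_levels, processing_ratios))
```

A coefficient near 1 across all samples would indicate that a single sample with high Hsf1 and low processing is an exception to the overall trend rather than evidence against it.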

      Major point 7. The reviewer notes: While the authors show in vitro cleavage of B2 RNA by Hsf1, the experiment lacks controls to be conclusive. At least, please include a similar size protein as HSF1 with no-known RNA binding activity and a similar size protein with RNA binding activity as controls in 5A. Please justify the use of PNK as the control protein. Please include the use domain-based deletions of Hsf1 to map the region of HSF1 that is binding and potentially cleaving the B2 RNA. Please include an RNA of similar size and Antisense-B2 RNA to show the specificity of the Hsf1 based cleavage of B2 RNA. Without these controls, the conclusions in Figure 5 cannot be substantiated.

      The endogenous ribozyme activity of B2 RNA compared to other control RNAs has already been shown in two previous works, but we will also include the relevant controls here by providing control incubations with other RNAs. We will also include the incubations with additional control proteins as suggested by the reviewer. We are currently performing these experiments and will include them in the revised version. PNK is used as a control protein because it is an RNA binding protein that is used in the construction of our short RNA libraries, and we wanted to show that the short RNA-seq data are free of such confounding factors that could potentially generate artificial fragments. We now include this information in the text.

      We feel that the application of domain-based deletions for Hsf1, while it would add information on the exact biochemistry underlying B2 RNA processing through Hsf1, is beyond the scope of this manuscript. In the current manuscript we focus only on the fact that Hsf1 can accelerate B2 RNA processing in vitro and not on the mechanism by which this happens. In our opinion, this should be addressed in a separate manuscript.

      Major point 8. The reviewer asks: “The authors should show that the incubated APP peptides are taken up by the cells (experiments in Figure 5F and Figure 6).”

      These figures are now labelled as Fig.6C and Figure 7, respectively. That’s a very interesting point and we thank the reviewer for this comment. Multiple studies have shown that toxicity after incubation with amyloid beta is mediated mainly by cell surface receptors, which through cell signalling lead to the response to cellular toxicity that induces stress genes such as Hsf1. Nevertheless, APP peptides may enter the cell, and the reviewer’s question raised the possibility that oligomers entering the cell could have a direct impact on the stability of the B2 RNA. In that case, providing evidence that the amyloid enters the cell would be important if we had indications that amyloid beta interacts directly with B2 RNA. We did test this and we found no direct effect of amyloid beta on B2 RNA, so the processing in our case is not induced by oligomers that may have entered the cell. We were planning to present this information in a different manuscript, but if the reviewer or editor thinks that it would be beneficial for the paper, we could present this as a supplementary figure showing that amyloid beta incubations with B2 RNA do not induce further processing beyond what Hsf1 causes. For the moment we just present this below:

      Major point 9. The reviewer asks: “Please provide the list, functional annotation, and % of the SRGs upregulated upon incubation with APP in HT22 cells in comparison to 6month old APP mice. Comment on learning-related Genes.”

      In the revised version, we now provide and mention in the text the following data: i) a list of genes upregulated in HT22 cells during amyloid toxicity upon incubation with amyloid beta (new Suppl. Table 9), ii) a list of genes according to point (i) that are common with genes upregulated in APP mice (new Suppl. Table 10), iii) the list and number of B2-SRGs that are upregulated in HT22 cells during amyloid toxicity (the reviewer’s question) (new Suppl. Table 11), and iv) the functional annotation of the genes of point (iii) (new Suppl. Table 12). We mention in the text the gene numbers and also the genes that are common to the three lists.

      We also mention in the text the limitations of our comparisons between the in vivo model of amyloid pathology (APP mice) and the in vitro cell culture model of amyloid toxicity (HT 22 cells) and we clarify that the cell culture model is used just as a simulation of the effect of amyloid beta in gene pathways associated with response to cellular stress and the role of Hsf1 on B2 RNA processing.

      Major point 10. The reviewer asks: “The authors should show the efficient downregulation of Hsf1 (protein) upon anti-Hsf1 LNA transfection.”

      In the revised version, in addition to the RNA-seq data, we provide a second confirmation at the mRNA level with an independent method (RT-qPCR) in new Figures 4E and 7B (lower panel). We are currently performing the protein extractions and will provide a Western blot or an ELISA in the revised version.

      Major point 11. The reviewer asks: “Please present the total B2 RNA levels for conditions in Figure 6C.”

      We now provide as new supplementary figure panels (Suppl. Fig. 6B and C) the levels of processed B2 RNA fragments and the total levels of unprocessed full length B2 RNAs of these samples that relate to old Figure 6C (now labeled as Fig.7C).

      Major point 12. The reviewer notes: “Hsf1 levels are not significantly downregulated in Control cells which were inoculated with the reverse APP peptide. Please comment.”

      We assume that the reviewer here refers to the lack of reduction in Hsf1 levels in the cells inoculated with the reverse peptide and the anti-Hsf1 LNA. Indeed, this lack of reduction is confirmed also by the new qPCR we performed (new Figure 7B, lower panel, R-ctrl vs R-anti-Hsf1). This should likely be attributed to compensation during non-stress conditions. In contrast, under stress conditions, Hsf1 is heavily used in stress response, which could explain the differences we see as cellular needs surpass the available Hsf1 transcripts due to degradation by the LNA. This is also supported by the new RT-qPCR experiments we have performed for B2-SRGs (new Figure 7E). In agreement with what is known for stress response genes such as immediate early genes (for example FosB), levels of these genes are minimal in both R-ctrl and R-anti-Hsf1 conditions and only become activated during stress response. We now discuss this in the text of the revised manuscript.

      Major point 13. The reviewer asks: “Please compare and contrast the % of genes, the overlap, and the functional distinctions in 6F to that of 5G and Figure1. What are the genes that are common between Figure1, and that are specifically upregulated upon Anti-Hsf1 LNA transfection along with 1-42 APP. What is % of the occurrence of B2 binding sites in those genes? What are their functional annotations and what is their connection to learning, memory, and cell survival?”

      Old Figure 6F is now Figure 7F, while old Figure 5G is now Figure 6C. This point is discussed in the response to points 1 and 9 of this reviewer. In summary, genes upregulated in our amyloid toxicity model included 25 B2-SRGs (new Suppl. Table 11). When testing for enriched terms in these 25 genes, biological processes related to apoptosis, such as regulation of apoptotic process and programmed cell death, were at the top of the list (new Suppl. Table 12) and included, among others, genes such as FosB and Mitf that have been connected with Alzheimer’s disease. Out of the 25 genes that are up-regulated in both mice and our cell culture system, six are B2-SRGs (4932438A13Rik, Fosb, Pag1, Ptprs, Sema5a, and Sgms1) and include a well-known immediate early gene (Fosb), genes associated with sensitivity to amyloid toxicity (Pag1, Sema5a, Sgms1, Fosb), as well as genes associated with p53 (Ptprs, Fosb). All these genes get upregulated in amyloid toxicity (42-Ctrl vs R-Ctrl) but are not upregulated when Hsf1 LNA is applied (42-anti-Hsf1 vs R-anti-Hsf1, no significant difference). This information is now included in the text.

      **Minor.**

      1 . Please include TPM/ FPKM values for hippocampal markers as control in Figure 2 to do justice to the hippocampus specific RNA seq conducted by the Authors.

      To our understanding, the reviewer here suggests the testing of well-known hippocampal markers in our mouse data as controls to confirm that they are indeed hippocampus specific. We have selected as reference markers, the genes employed by the Allen Brain Atlas RNA-sequencing project and we provide a comparison of their data in hippocampal cells with our data from mouse hippocampus. This is now presented as new Supplementary Figure 2.

      2 . In figure 2D the authors show that B2 RNA regulated SRGs in the 3 months' wild type mice are significantly high. P53 has been reported to be high in young wild types hippocampus, but not SRGs in my opinion. The authors should comment on this.

      Old Figure 2D is now Figure 2E. We now mention the reviewer’s comment in the discussion and cite a landmark review article in Neuron by Michael Greenberg regarding the role of stress response genes, such as FosB, early during development. To prevent any confusion, we have also replaced SRGs with B2-SRGs, since we tested only B2-SRGs in our study.

      3 . In figure 2F, under the 6m APP condition, the replicate 3 looks substantially different from the other replicate. This can significantly impact the analysis and conclusions made. Either remove that replicate and present the analysis without it or please provide a valid explanation. To make the data more valid, please provide hierarchical clustering of the entire data, the non-B2 regulated genes and the B2 regulated SRGs.

      We now provide in the new Supplementary Figure 9C a PCA plot, which includes 6m APP mice vs. their WT counterparts and HT22 cells, and shows that this variability is within the biological replicate variability we can expect in these models. To substantiate this further, we have constructed the correlation matrix of the RNA-seq data of both WT and APP 6 month old mice in the new Supplementary Figure 9D. As shown in this matrix, all APP mice clearly correlate with each other and not with their WT counterparts.

      In the initial manuscript the heatmaps of former Figure 2 were indeed provided with hierarchical clustering of the entire data and also included non-B2 RNA regulated genes. This data is included now as Supplementary figure 2.

      In Figure 2C RNA seq data is represented in TPM while its FPKM in Figure 2D.

      Figure 2D is now Figure 2E, while Figure 2C remains labelled with the same number. Given that TPM already includes scaling of the data, it is unsuitable for the averaging of the gene expression levels of multiple genes (B2-SRGs) used in the boxplots of Figure 2. This does not apply in the case of single genes as in Fig 2C (p53) or in the heatmap where each gene is presented in a separate row. This explanation is now included in the methods section.
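
As background for the units discussed in this reply, the standard definitions of the two measures can be sketched as follows. This is a generic illustration and not the quantification code used for the manuscript; the counts and gene lengths are invented placeholders.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no mapped fragments")
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    if total_rate == 0:
        raise ValueError("no mapped fragments")
    return [r / total_rate * 1e6 for r in rates]


# Placeholder sample with three genes.
counts = [10, 20, 70]
lengths_bp = [1000, 2000, 1000]
print(fpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

The TPM values of a sample always sum to one million, whereas the sum of FPKM values varies from sample to sample; this is the additional scaling that the reply refers to.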

      Figure 2: the number of replicates in the case of 3-month-old wild types only 2. Please specifically denote it and comment why only 2 replicates are provided.

      During the hippocampal RNA extractions, the RNA of one of the three 3m old mice had very low RIN scores, which could be a confounding factor for the short-RNA-seq. As this happened some months after the hippocampal extractions, we did not have any other 3 month mice of the same cohort used for the behavioral and IHC studies. Thus, we decided to include only two replicates in this condition. Since the results presented in the current study focus mainly on 6 month old mice, we expect the impact to be minimal. We include this note in the methods section.

      4 . Considering that p53 and SRGs are significantly upregulated in 6months in the APP model, it would be great if (allowing that these samples are still available) the authors can include a staining for apoptotic markers, for example, Active Casp3 or similar. This will allow us to better gauge the gene expression changes presented by the authors especially regarding SRGs.

      Unfortunately, we do not have these slides, but in the revised version we will provide qPCR data for some of these markers.

      5 . Under subheading: Hsf1 accelerates B2 RNA processing, 3rd paragraph when the authors comment on known hsf1 binding sites on SRG genes, please correct from: Increased Hsf1-binding was found.... "To the increased number of hsf1 binding sites were found", unless the authors would like to show increased Hsf1 binding by performing CHIP-seq for Hsf1 in the hippocampus at least at the 6-month time point between Wt and APP mice.

      We have changed the text accordingly.

      Reviewer #1 (Significance (Required)):

      B2 RNAs, encoded from SINE B2 elements, have been directly implicated in the stress response through their inherent ability to bind RNA Pol II and suppress stress response genes (SRGs) in homeostatic conditions. However, upon stimuli, B2 RNAs are cleaved and degraded, resulting in the release of RNA Pol II and upregulation of SRGs. Previous work from the senior author identified the PRC2 component EZH2 as the B2 RNA processing factor, cleaving B2 and releasing Pol II. SRGs are upregulated upon stress, for example in age-associated neuropathologies like Alzheimer's disease (AD). Considering that the hippocampus is a primary target of amyloid pathologies, and since SRGs are suggested to be key for the function of a healthy hippocampus, the authors set out to understand the role of B2 RNAs that are linked to SRG regulation in the mouse hippocampus with amyloid pathology. They use disease-relevant in vivo and in vitro models combined with unbiased RNA-seq data analysis for this endeavor, which indicates the potential relevance of B2 RNAs in APP-mediated neuronal pathologies in mice and identifies Hsf1 as the factor cleaving B2 RNAs in the hippocampus.

      The work is interesting and identification of Hsf1 as the processing factor for B2 RNAs in the hippocampus is significant. I would like to credit the authors for their elegant in vivo experimental design in Figure 2.

      REVIEWER 2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This manuscript follows from previous work by the corresponding author showing that SINE-encoded B2 RNAs function as regulators of the expression of stress response genes (SRGs). Specifically, stimulus triggers the processing of repressive B2 RNAs that are bound at the SRGs, thereby activating SRG transcription. In this work, the authors investigate whether a similar mechanism might be controlling the expression of genes in models of amyloid beta neuropathology (i.e. mouse hippocampi from an amyloid precursor protein knock-in mouse model, and a cell culture model of amyloid beta toxicity). They performed RNA-seq in these models. Their data show a correlation between the progression of amyloid pathology, expression of genes thought to be regulated by B2 RNA, and the processing of B2 RNA. In addition, they show biochemical data supporting a role for Hsf1 in enhancing the processing of B2 RNA. Knockdown of Hsf1 also reduced B2 RNA processing and the expression of SRGs.

      **Major comments:**

      Major point 1. The reviewer asks: “In the RNA-seq data one cannot distinguish between Pol III transcribed B2 RNA and Pol II transcribed B2 RNA (typically embedded within introns and UTRs of mRNAs). The models they present, and the structures they show, clearly imply regulation by Pol III transcribed B2 RNA. However, there is no way to know that the short B2 RNAs they sequence aren't coming from degraded mRNAs. This needs to addressed. Minimally, in writing as a caveat of their model. Ideally, it would be addressed experimentally.”

      That’s a very interesting point, as it implies that the regulatory role of B2 RNAs may extend from Pol III transcribed B2 RNAs to B2 RNAs embedded in mRNAs (likely nascent ones), which may also be subject to the same endogenous ribozyme activity of this sequence, suppress Pol II and be processed in response to stimuli. The RIN values of our RNA samples were high, except for one 3m old mouse sample, which was for this reason excluded from further analysis. Moreover, during the library construction shorter and longer RNAs were separated. Thus, any generation of B2 RNA fragments that may have originated from mRNA should be biologically but not technically related and must have happened in the cell before our RNA extraction. To address this point, we now provide a new supplementary figure (Suppl. Figure 8), where we have separated the B2 elements against which we map the RNA fragments into two categories, those that fall within exonic/genic regions and those outside of these regions. Although B2 RNAs are produced from multiple copies in the genome, each copy harbors multiple SNPs, insertions and deletions, which means that each B2 RNA fragment is mapped to a specific set of B2 elements and not to all of them. In other words, despite multiple mapping, a level of spatial specificity is maintained. If the B2 RNAs we map were coming exclusively from either only Pol III B2 elements or mRNA embedded B2 elements, we would expect at least some difference in the distribution of fragments between B2 elements of these two categories, as the second one overlaps with mRNAs. As shown in the new supplementary figure 8, the fact that the distribution models are very similar between the two categories indeed supports the hypothesis that both types of B2 elements may contribute to B2 RNA processing. Most importantly, the profile of B2 RNAs in genic regions shows that B2 RNA processing is not random but follows the same processing rules as B2 RNAs from Pol III promoters.

      Given the limitations posed by the repetitive nature of B2 RNAs, it remains difficult, however, to provide an exact number regarding the portion of B2 RNA fragments produced by each category, and this is clearly noted in our revised discussion. Nevertheless, even the indication that B2 RNAs embedded in mRNAs may also play an important role in our model provides a new perspective that should be investigated further in future studies.

      Major point 2. The reviewer asks: “The direct regulation of SRGs by B2 RNA was not shown in their model systems for amyloid beta neuropathology. Rather, the authors' used the genes identified in their prior studies as B2 RNA-regulated, which I believe were in the NIH3T3 cell line. Given that transcription is highly cell-type specific, these genes might not be regulated by B2 RNA in mouse hippocampi or their cell culture model, despite the correlations shown. This needs to be addressed. Ideally, a targeted approach to show that transcription of even a couple genes in their system is indeed regulated by B2 RNA would provide stronger support for their conclusions.”

      We agree with the reviewer and we now provide a new figure (Fig.6D-F) with the targeted approach that this reviewer proposed. In particular, we have tested whether fragmentation of full length B2 RNAs is linked to activation of target genes in our biological system (HT22 cells) as well, as it was in NIH/3T3 cells in our Cell paper. We now show in new Figure 6 that this is indeed the case.

      Major point 3. The reviewer requests several additional analyses: “The following bioinformatics analyses would strengthen their conclusions. This should be straightforward to do because it involves data they already have, and perhaps analyses they have already have performed.”

      a. Regarding the plot in Figure 3A (lower panel). The same plot should be shown for the 3m old and the 12m old APP mice (i.e. not just the 6m data). This would show the specificity of processing B2 RNA and that it indeed correlates with disease progression.

      We now provide this plot as a new supplementary figure (Suppl. Figure 3). It shows that increased B2 RNA processing coincides only with the active neurodegeneration phase at 6 months and not with the terminal stage.

      b. Regarding the plots of B2 RNA processing rate. This value could increase either due to more short RNAs or less full length RNA. Which is it for the 3m, 6m, and 12m APP mice? Showing the short and long B2 RNAs as boxplots (as opposed to only the processing rate) would address this and also provide additional insight into the regulation involved. The same applies to the data in Figure 6. (As an aside... do the authors mean processing ratio as opposed to rate? I'm not clear where the time component is coming into play to call this a rate.)

      Old Figure 6 is now Figure 7. We now provide all these figures, which show that the increase in processing ratio at 6 months is mainly due to an increase in the processed fragments and not a decrease in full length B2 RNAs. For APP mice these are new Figures 3E and F, and for HT22 cells these are new Suppl. Figures 6B and C.
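
      The reviewer's distinction can be made concrete with a minimal sketch. It assumes that the processing ratio is defined as the abundance of processed (short) fragments divided by the abundance of full length B2 RNA; the function name and all numbers are hypothetical and are not values from the study.

```python
def processing_ratio(short_fragments: float, full_length: float) -> float:
    """Processed B2 fragments divided by full length B2 RNA (assumed definition)."""
    if full_length <= 0:
        raise ValueError("full_length must be positive")
    return short_fragments / full_length


# Hypothetical abundances: the same rise in the ratio can come from either term.
baseline = processing_ratio(100, 50)
more_fragments = processing_ratio(200, 50)    # numerator doubled
less_full_length = processing_ratio(100, 25)  # denominator halved
```

      Because both scenarios give the same ratio, the short and full length abundances have to be reported separately to tell them apart, which is what the separate plots address.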

      c. The random genes in Figures 2E and 6E are plotted as heat maps, but statistical significance is hard to see. What do boxplots of the random genes look like, and is the significant difference between 6m old APP and 6m old WT then lost?

      Old Figure 2E is now new Suppl. Figure 1C, while old Figure 6E is now new Suppl. Figure 7C. We now provide these boxplots in new supplementary figures 1B and 7B.

      Major point 4. The reviewer comments: “ It is interesting that B2 RNA self-processing is enhanced by both Ezh2 and also Hsf1. It would strengthen the data to perform a control with a protein prepared more similarly to the Hsf1 (rather than PNK) to confirm that the enhanced B2 RNA breakdown is indeed attributable to Hsf1 and not a contaminant in the protein prep. Similarly, the authors should provide information on which RNA was added as the negative control for Hsf1-stimulated breakdown (i.e. the ~80 nt RNA).”

      This point is also discussed in Reviewer 1 point 7. The endogenous ribozyme activity of B2 RNA has already been shown in two previous studies that performed incubations with control RNAs and proteins. We are currently preparing these additional incubations and will provide them as a new supplementary figure in the revised manuscript.

      **Minor comments:**

      1 . Regarding the GO analyses in Figure 1 (panels B, C, and D). I wasn't clear whether the authors are showing all statistically enriched terms, or only those relevant to neuronal processes and learning. I recommend showing a supplemental table with all terms that have an adjusted p value below a specified cut-off (e.g. 0.05).

      The statistical threshold used was an EASE score of 0.05 and all presented terms passed this threshold. In the initial manuscript we showed only the top 5 terms for tissue enrichment and the top 10 terms for GO Biological Process and Cellular Component among those that had passed the threshold. We now provide all the terms that passed the threshold as a new Supplementary Table 2, including gene counts, exact gene numbers and related statistics.
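
      The filtering step can be sketched as follows. This is an illustration only: it assumes each enriched term is a record with an EASE score (a modified Fisher exact p-value, so lower means more significant) and that a term passes when its score is at or below the cutoff. The term names and scores are hypothetical.

```python
EASE_CUTOFF = 0.05  # threshold stated in the response


def passing_terms(terms, cutoff=EASE_CUTOFF):
    """Return terms with EASE score at or below the cutoff, most significant first."""
    kept = [t for t in terms if t["ease"] <= cutoff]
    return sorted(kept, key=lambda t: t["ease"])


# Hypothetical enrichment records, for illustration only.
example_terms = [
    {"term": "term_A", "ease": 0.001},
    {"term": "term_B", "ease": 0.20},
    {"term": "term_C", "ease": 0.03},
]
selected = passing_terms(example_terms)
top_one = selected[:1]  # a "top N" view keeps only the first N passing terms
```

      Reporting `selected` in full corresponds to a complete supplementary table, whereas slicing it corresponds to a top 5 or top 10 view.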

      2 . The authors show several figures that are not new data (2B, 4A, 4B, Suppl. Fig 1 and 2). I think it would be more clear if these data were summarized and referenced in the results, rather than shown.

      Old Suppl. Figures 1 and 2, which showed results of previous studies or directly available web resources (such as the Human Protein Atlas), have now been removed and are only referenced in the text. Old Figures 4A and 4B have been removed from the main figures but remain in the Supplement (currently as Suppl. Figure 4A and B), since they may help readers who are not familiar with the RNA-seq browsing tools of the Allen Brain Atlas resources. Regarding Figure 2B, which contains data from our previous study on this exact cohort of mice: if the reviewer and the editor agree, we recommend that it remain in the main figure (with the appropriate image credit citations), as it efficiently conveys the connection between amyloid load and our results at the molecular level and, most importantly, draws a clear line in amyloid pathology progression between 3m old and 6m old mice that agrees with our findings in the RNA-seq data of these mice.

      3 . In Figure 3A the schematic shows that B2 is 155 nt, the plots in Figures 3A,B,C show B2 RNA is 120 nt, and Figure 5 shows the RNA is 188 nt. Can the authors please clarify these differences?

      The full length of the B2 consensus sequence is 188nt and this is the one we use for the in vitro experiments. However, the structure of the B2 RNA has been resolved only for the first 155nt by the Kugel lab, and this is the only publicly available structure that we can reference in our figures. For the mapping of 5’ ends of short fragments in Fig.3A we have used the same range tested in our Cell paper to maintain consistency of the results. The 120nt threshold was selected in the Cell paper to exclude artifacts from short RNAs mapping partially in our metagene as well as downstream of those B2 elements that are shorter than the consensus sequence. We now explain these differences in the Methods section.

      4 . In the Methods section, the sequence of the g block template didn't contain the T7 promoter sequence that was used as the forward primer for PCR amplification?

      We have now included this sequence in lower case.

      5 . In Figure 6B, why were Hsf1 levels not decreased in the R treated cells after treatment with the LNA?

      Old Figure 6B is now new Figure 7B. Please see response to Reviewer 1, major point 12.

      Reviewer #2 (Significance (Required)):

      Finally, this reviewer generally remarks that “The models presented for the regulation of stress response genes (SRGs) in amyloid beta neuropathologies are compelling. As are the correlations they found between the progression of amyloid pathology, expression of genes thought to be regulated by B2 RNA, and the processing of B2 RNA. This is a unique direction of research for brain disease and represents an interesting conceptual advance. Most prior studies in this area use common model cell lines, and this lab seems well-positioned to unravel the proposed molecular mechanisms in neuronal systems.”

      We appreciate the encouraging comments made by this reviewer.

      REVIEWER 3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes a regulatory mechanism involving Hsf1 and B2 RNAs in the control of stress response genes (SRGs) during amyloid induced toxicity. In particular Hsf1, upregulated in 6m old APP mice and in HT22 cells treated with beta amyloid peptides, is shown to stimulate the B2 RNA destabilization leading to SRGs activation. While in healthy cells this upregulation can be reverted once the stimulus is removed, the pathological condition fuels the circuitry leading to p53 upregulation and neuronal cell death. The authors previously described the same mechanism acting during cellular heath shock response but in this case the protein identified as trigger of B2 RNA destabilization and SRGs activation was EZH2 (Zovoilis et al, 2016).

      This reviewer generally remarks that “Indeed, the first part of the manuscript describes additional analyses of the previous data that prompts further investigation on the potential role of B2 RNA in AD condition. Nevertheless, it is not clear how the prior findings obtained in not biologically related cellular models might be used to obtain helpful indication of B2 RNA neuronal activity.”

      We thank the reviewer for this comment. Indeed, the main aim of the current study was to extend the findings of our previous work on the role of B2 RNA in cellular response to thermal stress in NIH/3T3 cells to other types of cellular response to stress, in our case to amyloid toxicity and the resulting amyloid pathology in neural cells. Response to thermal stress (heat shock) has been used for years as a basic study model for cellular response to stress. Proteins and gene pathways initially identified in heat shock have subsequently been shown to play identical pro-survival roles in other biological systems, and there are studies showing the role of Hsf1, heat shock related proteins and cell stress response pathways in neural cells and the mammalian brain (we will provide these references in the revised version). For example, pathways such as the MAPK pathway and early response genes, which constitute the basis of the response to heat shock, have been shown in studies by us and others to be activated and to play a critical role in hippocampal function. Thus, examining the role of B2 RNA in the context of neural response to stress constituted a natural continuation of our previous study in NIH/3T3 cells. The fact that the list of B2 RNA regulated SRGs was found to be highly enriched in neuronal tissue terms and cellular compartments related to neuronal functions confirms the close relationship among cellular response pathways in the two biological systems. For these reasons, we set out to investigate our previous findings in more detail in a neural cell model as well. However, as discussed in point 2 of Reviewer 2, the initial manuscript did not confirm the direct control of B2 RNA over expression of target genes in our cellular model. This information is now part of the new Figure 6 and we thank both reviewers for bringing this to our attention.

      The reviewer also remarks that “The research fields of non coding RNAs and neurodegeneration are attractive and challenging and, in my opinion, the molecular circuitry involving B2 RNAs might add important insights for understanding beta amyloid toxicity and neuronal death; however, the data provided are not in the shape making the manuscript suitable for publication: some controls are missing, the way the experiments are presented is not easy to follow and more importantly the authors does not provide any data (tables or lists) of the NGS experiments and the study lacks validation of them. Therefore, in my opinion the manuscript needs a profound revision before to be considered for publication in Review Commons.”

      Based on this reviewer’s and the other reviewers’ suggestions we now provide additional controls, detailed tables and gene lists, and qPCR validation of these results. We have also substantially revised the text in the first section of the results and the beginning of the discussion, to make our rationale for testing B2-SRGs clearer and easier to follow.

      **major concerns:**

      Major point 1. The reviewer asks: “The first paragraph of the Results is entirely dedicated to re-analyze the data previously published by the same group (Zovoilis et al., 2016). However, this is not adequately explained. In line with this, the table 1 is not required since the data are already provided by Zovoilis et al., 2016, unless the authors handled the data using additional new criteria that have to be explained.”

      We now explain our rationale for using this data in more detail in the text. Please see also the response to the general comment of this reviewer and the response to the next point.

      In the Zovoilis et al (2016) study, the data presented did not include the list of regulated genes directly but only as part of the annotation of the B2 CHART peaks. This may make it difficult for non-experts to extract the gene list from that data, so we chose to include it here as a separate gene list that readers can use directly for their analysis. Nevertheless, if the reviewer or the editor think that the list is redundant, we can omit it.

      In addition, the reviewer comments: “Moreover, Zovoilis and colleagues (2016) focused on SRGs regulated upon heat shock and using NIH/3T3 and HeLa cell lines, therefore, it is difficult to me understand how, searching for "cellular function connected with B2 RNA regulated SRGs", the list resulted enriched of neuronal tissue terms or cellular compartments related to neuronal functions. Please clarify this point since the following analyses are based on these findings.”

      Neural pathologies, such as amyloid pathology in the brain, are often connected with cellular stress due to proteotoxicity. The ability of neural cells to respond to proteotoxicity challenges is connected with various molecular mechanisms, including stress related proteins that were first described in the context of heat shock. Thus, both contexts (heat shock and amyloid toxicity) refer to cellular response to stress, which explains why genes identified to be regulated during stress response in NIH/3T3 cells constitute part of the basic stress response toolbox that neural cells have also been described to possess. We have now modified the text accordingly to make our rationale clearer.

      Major point 2. The reviewer comments: “In Figure 1F there is no arrow indicating that some of the SRGs regulate directly miR-34 as stated in the main text. Moreover, it is more appropriate to replace SRGs with learning‐associated genes both in the figure and in text (2nd paragraph of the results) since Zovoilis and colleagues focused on them. Finally, they did not show in their manuscript the rescue of p53 expression mediated by mir-34; indeed, for miR-34-p53 regulatory axis Zovoilis and colleagues referred to Peleg et al, 2010 and Yamakuchi & Lowenstein, 2009. Please fix all these concerns.”

      We have restructured the figure as suggested by the reviewer and made clear the distinction between learning genes and B2 RNA regulated SRGs (B2-SRGs) from the two different studies. In connection with point 1 of Reviewer 1, we believe that the new Figure 1E, which includes the exact number of B2-SRGs that are learning associated, represents the data more efficiently and accurately. We have also corrected the citation regarding miR-34c and p53 in both the introduction and the first section of the results (last paragraph).

      -The Fig.1A and Fig.1F are wrongly indicated at the end of the sentence "....levels of these genes are normally downregulated in 6m and 12m old mice compared to 3m old mice (p=0.02 and p=0.04, respectively)"; please correct this point.

      The error has been corrected.

      Major point 3. The reviewer comments regarding Figure 2:

      a) Since three mice for each condition have been used for the RNA seq analyses, please provide a blot with the Principal Component Analysis (PCA).

      Please see also response to minor point 3 of Reviewer 1. We provide the PCA plots for WT and APP mice in the new Supplementary Figure 9 and we also provide a comparison of the six month old mice with the HT cell samples as well as a correlation matrix for 6 month old mice in the same figure.

      b) Fig 2F comes first of Fig 2E in the text, however, I suggest to move this latter to supplementary material.

      Old figure 2E has now been moved to supplementary material as new Supplementary Figure 2C and we also provide in a boxplot the exact gene expression levels as new Supplementary Figure 2B.

      c) In general, this study lacks validation of the RNA-seq results. Western blot and/or qRTR-PCR to verify the variation of p53 and of some selected SRGs have to be provided.

      In the current revised version we already provide qPCRs for p53 and Hsf1 in APP mice and we will include additional genes in the final version.

      d) It is also not clear how the authors defined SRGs in the hippocampus: do they correspond to learning‐associated genes described by in Zovoilis et al, 2011 or to B2 RNA H/S regulated genes by Zovoilis et al, 2016?

      The way we presented B2 RNA SRGs in the results with regard to learning associated genes was indeed unclear. We now present the distinction between the two gene categories and their relationship as a new Fig.1E panel and we also provide detailed gene lists of common genes and the exact numbers (please see also response to Reviewer 1, major point 1).

      -APP 12 month old mice show the sever phenotype of the terminal AD-like pathology, however this does not correlate with significant SRGs and B2 processing increase. Can the author make a comment on this?

      That’s a very important point and we thank the reviewer for raising it. We now comment on this in the Discussion, explaining how our findings are characteristic of the initial active neurodegeneration phase of amyloid pathology rather than of more terminal stages.

      Major point 4: The reviewer comments regarding Figure 5:

      a) a gel with no-protein control for the time course of panel B was cited in the text but missing among the panels. Moreover, the time course shown in the graph in 5C does not correspond to the one in 5B.

      Indeed, the no-protein control timeline should refer only to panel C and not to panel B; we have now corrected the text. In addition, we now present in the new Supplementary Fig. 5 the gels on which the graph in panel C was based, including the gel with the no-protein timeline. The time course shown in the initial 5C had been mislabeled and has now been corrected. We apologize for this and we thank the reviewer for bringing this to our attention.

      b) 5G indicates that four samples for each condition have been analysed by RNA-seq, since they do not seem to be homogeneous please provide a PCA analysis together with the validation by qRT-PCR of a selected group of deregulated genes.

      Old Figure 5G is new Figure 6C. PCA analysis for these samples is now provided in Supplementary Figure 9 and qPCR validation of a number of these genes is provided in new Fig. 7E.

      Moreover, it is not clear whether all the genes shown in the heatmap or a number of them, as stated in the text, were found upregulated in 6m old APP mice. Please clarify this point and modify the figure and the text accordingly. A Venn diagram showing the overlap between genes upregulated in 42vsR treatment and those upregulated in 6m old APP mice might help the comprehension of the experiment.

      Please see response to Reviewer 1, point 9. We now provide as new supplementary tables the exact overlapping lists and mention these numbers in the text.

      Major point 5: The reviewer comments regarding Figure 6 (now labeled as Fig.7):

      a) The evaluation of the levels of Hsf1 mRNA and protein upon LNA transfection is missing for both R and 42 treated HT22 cells. From TPM in panel B, Hsf1 downregulation seems to have been more effective in 42 than in R condition. This would mess up the interpretation of the data.

      We now provide qPCR data for Hsf1 gene expression levels which confirm the ones from the RNAseq. The reason why Hsf1 downregulation seems not to affect the R condition is discussed in our response to Reviewer 1, major point 12, and the respective explanation is provided in the revised text.

      b) Again, in this case any validation of the RNA seq data is provided (any B2 regulated SRGs).

      We now provide qPCR data for these genes in Fig.7B and new Fig.7E.

      c) Panels E and F should be swapped or panel E moved to supplementary material.

      Panel E is now moved to supplementary material as new Suppl. Figure 7C.

      Major point 6. The reviewer comments: “In a previous paper the authors discovered B2 RNAs as a class of transcripts bound to EZH2 and this interaction leads to B2 RNA destabilization in heath shock (H/S) condition. The authors also conclude that the genes controlled by B2 RNAs may not overlap with the ones controlled by Hsf1 during H/S. The author should make a comment on this explaining why during H/S B2 RNAs work independently from Hsf1 and on different target SRGs while, during beta amyloid stress ,the two act together on the same SRGs. Moreover, as shown for EZH2, Hsf1-RIP experiment should be performed in order to confirm the direct involvement of Hsf1 in the SRGs-B2 destabilization.”

      In the last two paragraphs of our discussion we indicate that B2 RNA regulation is a new process implicated in the response to stress in amyloid pathology but certainly not the only one. We have revised this part of the text in the revised version to prevent any confusion. We are currently performing a series of RIP-seq experiments with various antibodies. Since, to our knowledge, no published study has performed RIP-seq or CLIP-seq in any tissue using Hsf1 antibodies, the success of this experiment is not guaranteed and depends on the existence of appropriate antibodies.

      Major point 7. The reviewer comments: “There is any table listing the results of the RNA seq experiments performed in this paper: control vs APP 3-6-12 m old mice and in R vs 42 treated HT22 cells in presence or absence of LNA against Hsf1. Please provide these data.”

      We now provide these lists as new supplementary tables. Please see response to major points 1 and 9 of reviewer 1.

      Major point 8. The reviewer comments: “In the discussion the authors claim that healthy cells are able to restore the expression of Hsf1, SRGs and B2 RNA upon removal of the stress. Since there are evidence for the rescue of SRGs and B2 RNA expression post H/S, no data are available for Hsf1, SRGs and B2 RNA upon the removal of 1-42 beta amyloid peptide. This might be a nice information to add to the manuscript.”

      This would indeed further substantiate our results in our HT22 cell model. We have now performed this experiment, in which HT22 cells were removed from the amyloid 42 peptide (and the respective R peptide control) and left to recover for 12 hours before Hsf1 levels were estimated through RT-qPCR (see graph below; REC corresponds to recovered HT22 cells). Hsf1 levels in 42-REC have returned to the same levels as in R, p We are currently performing the RT-qPCRs of these samples for B2-SRGs as well and will include them in the final version as a supplementary figure.

      **Minor criticisms:**

      -In the introduction the reference Yamakuchi M and Lowenstein CJ, (2009) MiR‐34, SIRT1 and p53: the feedback loop. Cell Cycle, should be added in the sentence: "In contrast, hippocampi of mouse models of amyloid pathology and post- mortem brains of human patients of AD.....and neural death (Zovoilis et al., 2011)."

      We have now changed the text at that point accordingly and also updated the legend of Figure 1F that also refers to this same study.

      -Authors refer to Hernandez et al., 2020 to state that B2 self cleavage is stimulated by some proteins however, Hernandez and colleagues studied only the effect of EZH2 protein. Please rephrase the sentence accordingly.

      Text has been modified accordingly.

      -Indicate a reference for the sentence: "......Ezh2, was reported as being responsible for the B2 RNA accelerated destabilization and processing during response to stress."

      The respective citation was added.

      -The format of many references is not consistent and has to be revised.

      We have switched to the Vancouver style. Some references in the legends and Methods sections are cited independently of EndNote, so that no inconsistencies with EndNote arise if these text sections have to be moved to the supplement in the final version.

      Reviewer #3 (Significance (Required)):

      Finally, this reviewer generally remarks that “The research fields of non coding RNAs and neurodegeneration are attractive and challenging and, in my opinion, the molecular circuitry involving B2 RNAs might add important insights for understanding beta amyloid toxicity and neuronal death.

      However, this manuscript does not really add technical advances since the authors employed experimental approaches and bioinformatic analyses previously published by Zovoilis and colleagues in 2011 and 2016.”

      Our aim in the current manuscript was not to introduce a new method or experimental approach but rather to study the mechanisms behind B2 RNA regulation of gene expression in neural cells and particularly in amyloid pathology. Nevertheless, the current study constitutes the first reported short-RNA seq in this tissue and offers, for the first time, the ability to study B2 RNA processing in this tissue, which is not possible with standard small and long RNA-seq.

      The reported findings might of interest of an audience of experts in non coding RNAs and neurodegeneration. The area of my expertise almost regards the biology of non coding RNAs from biogenesis to function manly focusing on neuronal and muscular systems both in physiological and pathological conditions.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript describes a regulatory mechanism involving Hsf1 and B2 RNAs in the control of stress response genes (SRGs) during amyloid induced toxicity. In particular Hsf1, upregulated in 6m old APP mice and in HT22 cells treated with beta amyloid peptides, is shown to stimulate the B2 RNA destabilization leading to SRGs activation. While in healthy cells this upregulation can be reverted once the stimulus is removed, the pathological condition fuels the circuitry leading to p53 upregulation and neuronal cell death. The authors previously described the same mechanism acting during cellular heath shock response but in this case the protein identified as trigger of B2 RNA destabilization and SRGs activation was EZH2 (Zovoilis et al, 2016). Indeed, the first part of the manuscript describes additional analyses of the previous data that prompts further investigation on the potential role of B2 RNA in AD condition. Nevertheless, it is not clear how the prior findings obtained in not biologically related cellular models might be used to obtain helpful indication of B2 RNA neuronal activity. The research fields of non coding RNAs and neurodegeneration are attractive and challenging and, in my opinion, the molecular circuitry involving B2 RNAs might add important insights for understanding beta amyloid toxicity and neuronal death; however, the data provided are not in the shape making the manuscript suitable for publication: some controls are missing, the way the experiments are presented is not easy to follow and more importantly the authors does not provide any data (tables or lists) of the NGS experiments and the study lacks validation of them. Therefore, in my opinion the manuscript needs a profound revision before to be considered for publication in Review Commons.

      major concerns:

      -The first paragraph of the Results is entirely dedicated to re-analyze the data previously published by the same group (Zovoilis et al., 2016). However, this is not adequately explained. In line with this, the table 1 is not required since the data are already provided by Zovoilis et al., 2016, unless the authors handled the data using additional new criteria that have to be explained. Moreover, Zovoilis and colleagues (2016) focused on SRGs regulated upon heat shock and using NIH/3T3 and HeLa cell lines, therefore, it is difficult to me understand how, searching for "cellular function connected with B2 RNA regulated SRGs", the list resulted enriched of neuronal tissue terms or cellular compartments related to neuronal functions. Please clarify this point since the following analyses are based on these findings.

      -In Figure 1F there is no arrow indicating that some of the SRGs regulate directly miR-34 as stated in the main text. Moreover, it is more appropriate to replace SRGs with learning‐associated genes both in the figure and in text (2nd paragraph of the results) since Zovoilis and colleagues focused on them. Finally, they did not show in their manuscript the rescue of p53 expression mediated by mir-34; indeed, for miR-34-p53 regulatory axis Zovoilis and colleagues referred to Peleg et al, 2010 and Yamakuchi & Lowenstein, 2009. Please fix all these concerns.

      -The Fig.1A and Fig.1F are wrongly indicated at the end of the sentence "....levels of these genes are normally downregulated in 6m and 12m old mice compared to 3m old mice (p=0.02 and p=0.04, respectively)"; please correct this point.

      -Figure 2:

      a) Since three mice for each condition have been used for the RNA seq analyses, please provide a blot with the Principal Component Analysis (PCA).

      b) Fig 2F comes first of Fig 2E in the text, however, I suggest to move this latter to supplementary material.

      c) In general, this study lacks validation of the RNA-seq results. Western blot and/or qRTR-PCR to verify the variation of p53 and of some selected SRGs have to be provided.

      d) It is also not clear how the authors defined SRGs in the hippocampus: do they correspond to learning‐associated genes described by in Zovoilis et al, 2011 or to B2 RNA H/S regulated genes by Zovoilis et al, 2016?

      -APP 12 month old mice show the sever phenotype of the terminal AD-like pathology, however this does not correlate with significant SRGs and B2 processing increase. Can the author make a comment on this?

      -Figure 5:

      a) a gel with no-protein control for the time course of panel B was cited in the text but missing among the panels. Moreover, the time course shown in the graph in 5C does not correspond to the one in 5B.

      b) 5G indicates that four samples for each condition have been analysed by RNA-seq, since they do not seem to be homogeneous please provide a PCA analysis together with the validation by qRT-PCR of a selected group of deregulated genes. Moreover, it is not clear whether all the genes shown in the heatmap or a number of them, as stated in the text, were found upregulated in 6m old APP mice. Please clarify this point and modify the figure and the text accordingly. A Venn diagram showing the overlap between genes upregulated in 42vsR treatment and those upregulated in 6m old APP mice might help the comprehension of the experiment.

      -Figure 6:

      a) The evaluation of the levels of Hsf1 mRNA and protein upon LNA transfection is missing for both R and 42 treated HT22 cells. From TPM in panel B, Hsf1 downregulation seems to have been more effective in 42 than in R condition. This would mess up the interpretation of the data.

      b) Again, in this case any validation of the RNA seq data is provided (any B2 regulated SRGs).

      c) Panels E and F should be swapped or panel E moved to supplementary material.

      -In a previous paper the authors discovered B2 RNAs as a class of transcripts bound to EZH2 and this interaction leads to B2 RNA destabilization in heath shock (H/S) condition. The authors also conclude that the genes controlled by B2 RNAs may not overlap with the ones controlled by Hsf1 during H/S. The author should make a comment on this explaining why during H/S B2 RNAs work independently from Hsf1 and on different target SRGs while, during beta amyloid stress ,the two act together on the same SRGs. Moreover, as shown for EZH2, Hsf1-RIP experiment should be performed in order to confirm the direct involvement of Hsf1 in the SRGs-B2 destabilization.

      -There is no table listing the results of the RNA-seq experiments performed in this paper: control vs APP 3-6-12 m old mice, and R vs 42 treated HT22 cells in the presence or absence of LNA against Hsf1. Please provide these data.

      -In the discussion the authors claim that healthy cells are able to restore the expression of Hsf1, SRGs and B2 RNA upon removal of the stress. While there is evidence for the rescue of SRG and B2 RNA expression post H/S, no data are available for Hsf1, SRGs and B2 RNA upon removal of the 1-42 beta amyloid peptide. This would be useful information to add to the manuscript.

      Minor criticisms:

      -In the introduction the reference Yamakuchi M and Lowenstein CJ, (2009) MiR‐34, SIRT1 and p53: the feedback loop. Cell Cycle, should be added in the sentence: "In contrast, hippocampi of mouse models of amyloid pathology and post- mortem brains of human patients of AD.....and neural death (Zovoilis et al., 2011)."

      -The authors refer to Hernandez et al., 2020 to state that B2 self-cleavage is stimulated by some proteins; however, Hernandez and colleagues studied only the effect of the EZH2 protein. Please rephrase the sentence accordingly.

      -Indicate a reference for the sentence: "......Ezh2, was reported as being responsible for the B2 RNA accelerated destabilization and processing during response to stress."

      -The format of many references is not consistent and has to be revised.

      Significance

      The research fields of non coding RNAs and neurodegeneration are attractive and challenging and, in my opinion, the molecular circuitry involving B2 RNAs might add important insights for understanding beta amyloid toxicity and neuronal death. However, this manuscript does not really add technical advances since the authors employed experimental approaches and bioinformatic analyses previously published by Zovoilis and colleagues in 2011 and 2016.

      The reported findings might be of interest to an audience of experts in non coding RNAs and neurodegeneration.

      My area of expertise is the biology of non coding RNAs, from biogenesis to function, mainly focusing on neuronal and muscular systems in both physiological and pathological conditions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript follows from previous work by the corresponding author showing that SINE-encoded B2 RNAs function as regulators of the expression of stress response genes (SRGs). Specifically, stimulus triggers the processing of repressive B2 RNAs that are bound at the SRGs, thereby activating SRG transcription. In this work, the authors investigate whether a similar mechanism might be controlling the expression of genes in models of amyloid beta neuropathology (i.e. mouse hippocampi from an amyloid precursor protein knock-in mouse model, and a cell culture model of amyloid beta toxicity). They performed RNA-seq in these models. Their data show a correlation between the progression of amyloid pathology, expression of genes thought to be regulated by B2 RNA, and the processing of B2 RNA. In addition, they show biochemical data supporting a role for Hsf1 in enhancing the processing of B2 RNA. Knockdown of Hsf1 also reduced B2 RNA processing and the expression of SRGs.

      Major comments:

      1 . In the RNA-seq data one cannot distinguish between Pol III transcribed B2 RNA and Pol II transcribed B2 RNA (typically embedded within introns and UTRs of mRNAs). The models they present, and the structures they show, clearly imply regulation by Pol III transcribed B2 RNA. However, there is no way to know that the short B2 RNAs they sequence aren't coming from degraded mRNAs. This needs to be addressed: minimally, in writing as a caveat of their model; ideally, experimentally.

      2 . The direct regulation of SRGs by B2 RNA was not shown in their model systems for amyloid beta neuropathology. Rather, the authors used the genes identified in their prior studies as B2 RNA-regulated, which I believe were in the NIH3T3 cell line. Given that transcription is highly cell-type specific, these genes might not be regulated by B2 RNA in mouse hippocampi or their cell culture model, despite the correlations shown. This needs to be addressed. Ideally, a targeted approach to show that transcription of even a couple of genes in their system is indeed regulated by B2 RNA would provide stronger support for their conclusions.

      3 . The following bioinformatics analyses would strengthen their conclusions. This should be straightforward to do because it involves data they already have, and perhaps analyses they have already performed.

      a. Regarding the plot in Figure 3A (lower panel). The same plot should be shown for the 3m old and the 12m old APP mice (i.e. not just the 6m data). This would show the specificity of processing B2 RNA and that it indeed correlates with disease progression.

      b. Regarding the plots of B2 RNA processing rate. This value could increase either due to more short RNAs or less full length RNA. Which is it for the 3m, 6m, and 12m APP mice? Showing the short and long B2 RNAs as boxplots (as opposed to only the processing rate) would address this and also provide additional insight into the regulation involved. The same applies to the data in Figure 6. (As an aside... do the authors mean processing ratio as opposed to rate? I'm not clear where the time component is coming into play to call this a rate.)

      c. The random genes in Figures 2E and 6E are plotted as heat maps, but statistical significance is hard to see. What do boxplots of the random genes look like, and is the significant difference between 6m old APP and 6m old WT then lost?

      4 . It is interesting that B2 RNA self-processing is enhanced by both Ezh2 and also Hsf1. It would strengthen the data to perform a control with a protein prepared more similarly to the Hsf1 (rather than PNK) to confirm that the enhanced B2 RNA breakdown is indeed attributable to Hsf1 and not a contaminant in the protein prep. Similarly, the authors should provide information on which RNA was added as the negative control for Hsf1-stimulated breakdown (i.e. the ~80 nt RNA).

      Minor comments:

      1 . Regarding the GO analyses in Figure 1 (panels B, C, and D). I wasn't clear whether the authors are showing all statistically enriched terms, or only those relevant to neuronal processes and learning. I recommend showing a supplemental table with all terms that have an adjusted p value below a specified cut-off (e.g. 0.05).

      2 . The authors show several figures that are not new data (2B, 4A, 4B, Suppl. Fig 1 and 2). I think it would be more clear if these data were summarized and referenced in the results, rather than shown.

      3 . In Figure 3A the schematic shows that B2 is 155 nt, the plots in Figures 3A,B,C show B2 RNA is 120 nt, and Figure 5 shows the RNA is 188 nt. Can the authors please clarify these differences?

      4 . In the Methods section, the sequence of the g block template didn't contain the T7 promoter sequence that was used as the forward primer for PCR amplification?

      5 . In Figure 6B, why were Hsf1 levels not decreased in the R treated cells after treatment with the LNA?

      Significance

      The models presented for the regulation of stress response genes (SRGs) in amyloid beta neuropathologies are compelling, as are the correlations they found between the progression of amyloid pathology, expression of genes thought to be regulated by B2 RNA, and the processing of B2 RNA. This is a unique direction of research for brain disease and represents an interesting conceptual advance. Most prior studies in this area use common model cell lines, and this lab seems well-positioned to unravel the proposed molecular mechanisms in neuronal systems.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      B2 RNAs, encoded by SINE B2 elements, have been directly implicated in the stress response through their inherent ability to bind RNA Pol II and suppress stress response genes (SRGs) in homeostatic conditions. However, upon stimuli, B2 RNAs are cleaved and degraded, resulting in the release of RNA Pol II and upregulation of SRGs. Previous work from the senior author identified the PRC2 component EZH2 as the B2 RNA processing factor, cleaving B2 and releasing Pol II. SRGs are upregulated upon stress, for example in age-associated neuropathologies like Alzheimer's disease (AD). Considering that the hippocampus is a primary target of amyloid pathologies, and since SRGs are suggested to be key for the function of a healthy hippocampus, the authors set out to understand the role of B2 RNAs that are linked to SRG regulation in the mouse hippocampus with amyloid pathology. They use disease-relevant in vivo and in vitro models combined with unbiased RNA-seq data analysis for this endeavor, which indicates the potential relevance of B2 RNAs in APP-mediated neuronal pathologies in mice and identifies Hsf1 as the factor cleaving B2 RNAs in the hippocampus. The work is interesting and the identification of Hsf1 as the processing factor for B2 RNAs in the hippocampus is significant. I would like to credit the authors for their elegant in vivo experimental design in Figure 2. However, I find some of the conclusions to be overstated and I would like to bring the following concerns to your attention:

      Major comments:

      1 . In figure 1, the authors indicate a strong connection between B2 RNA regulated SRGs and learning and memory. In figure 2, they identify the SRGs in the hippocampus. Please provide a direct comparison of the learning and memory associated SRGs and the SRGs identified in figure 2 that are significantly upregulated in APP mice at 6 months.

      2 . To better understand the data in the context of hippocampal function, please include functional annotation of SRGs they identified in Figure 2F as they do it in Figure 1 (desirably for each time point, at least for 6M). How many of the SRGs they identify in Figure 1 are part of Figure 2F? Please include functional annotation of significantly upregulated B2 regulated SRGs in Fig2 and compare them with that of Figure 1.

      3 . In figure 3, the authors report that the B2 processing rates are high at the 6M time point in hippocampi of the APP mice. Please include the levels of unprocessed and processed B2 RNAs in these samples along with this figure, without which it is difficult to gauge the significance of its correlation with SRGs in Figure 2.

      4 . What is the % of B2 regulated SRGs that are Hsf1 bound in Figure 4C? What are their dynamics in the wild type and APP hippocampi?

      5 . What is the distribution of Hsf1 binding sites on (a) non-B2 regulated SRGs and (b) non-SRG genes in hippocampi?

      6 . In Figure 4D, the 3months old Wt HSF1 levels are high, yet B2 processing (Figure 3E) is low. Please comment.

      7 . While the authors show in vitro cleavage of B2 RNA by Hsf1, the experiment lacks the controls needed to be conclusive. At least, please include a protein of similar size to HSF1 with no known RNA binding activity, and a protein of similar size with RNA binding activity, as controls in 5A. Please justify the use of PNK as the control protein. Please include the use of domain-based deletions of Hsf1 to map the region of HSF1 that is binding and potentially cleaving the B2 RNA. Please include an RNA of similar size and antisense B2 RNA to show the specificity of the Hsf1-based cleavage of B2 RNA. Without these controls, the conclusions in Figure 5 cannot be substantiated.

      8 . The authors should show that the incubated APP peptides are taken up by the cells (experiments in Figure 5F and Figure 6).

      9 . Please provide the list, functional annotation, and % of the SRGs upregulated upon incubation with APP in HT22 cells in comparison to 6month old APP mice. Comment on learning-related Genes.

      10 . The authors should show the efficient downregulation of Hsf1 (protein) upon anti-Hsf1 LNA transfection.

      11 . Please present the total B2 RNA levels for conditions in Figure 6C.

      12 . Hsf1 levels are not significantly downregulated in Control cells which were inoculated with the reverse APP peptide. Please comment.

      13 . Please compare and contrast the % of genes, the overlap, and the functional distinctions in 6F to those of 5G and Figure 1. Which genes are common to Figure 1 and specifically upregulated upon anti-Hsf1 LNA transfection along with 1-42 APP? What is the % occurrence of B2 binding sites in those genes? What are their functional annotations and what is their connection to learning, memory, and cell survival?

      Minor.

      1 . Please include TPM/ FPKM values for hippocampal markers as control in Figure 2 to do justice to the hippocampus specific RNA seq conducted by the Authors.

      2 . In figure 2D the authors show that B2 RNA regulated SRGs are significantly high in the 3-month-old wild type mice. P53 has been reported to be high in the young wild-type hippocampus, but not SRGs in my opinion. The authors should comment on this.

      3 . In figure 2F, under the 6m APP condition, replicate 3 looks substantially different from the other replicates. This can significantly impact the analysis and the conclusions made. Either remove that replicate and present the analysis without it, or please provide a valid explanation. To make the data more robust, please provide hierarchical clustering of the entire data, the non-B2 regulated genes and the B2 regulated SRGs. In Figure 2C RNA-seq data are represented in TPM, while in Figure 2D they are in FPKM. In Figure 2, the number of replicates for the 3-month-old wild types is only 2. Please denote this specifically and comment on why only 2 replicates are provided.

      4 . Considering that p53 and SRGs are significantly upregulated in 6months in the APP model, it would be great if (allowing that these samples are still available) the authors can include a staining for apoptotic markers, for example, Active Casp3 or similar. This will allow us to better gauge the gene expression changes presented by the authors especially regarding SRGs.

      5 . Under the subheading "Hsf1 accelerates B2 RNA processing", 3rd paragraph, where the authors comment on known Hsf1 binding sites on SRG genes, please change "Increased Hsf1-binding was found...." to "an increased number of Hsf1 binding sites was found", unless the authors would like to show increased Hsf1 binding by performing ChIP-seq for Hsf1 in the hippocampus, at least at the 6-month time point, between Wt and APP mice.

      Significance

      B2 RNAs, encoded by SINE B2 elements, have been directly implicated in the stress response through their inherent ability to bind RNA Pol II and suppress stress response genes (SRGs) in homeostatic conditions. However, upon stimuli, B2 RNAs are cleaved and degraded, resulting in the release of RNA Pol II and upregulation of SRGs. Previous work from the senior author identified the PRC2 component EZH2 as the B2 RNA processing factor, cleaving B2 and releasing Pol II. SRGs are upregulated upon stress, for example in age-associated neuropathologies like Alzheimer's disease (AD). Considering that the hippocampus is a primary target of amyloid pathologies, and since SRGs are suggested to be key for the function of a healthy hippocampus, the authors set out to understand the role of B2 RNAs that are linked to SRG regulation in the mouse hippocampus with amyloid pathology. They use disease-relevant in vivo and in vitro models combined with unbiased RNA-seq data analysis for this endeavor, which indicates the potential relevance of B2 RNAs in APP-mediated neuronal pathologies in mice and identifies Hsf1 as the factor cleaving B2 RNAs in the hippocampus.

      The work is interesting and identification of Hsf1 as the processing factor for B2 RNAs in the hippocampus is significant. I would like to credit the authors for their elegant in vivo experimental design in Figure 2.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their useful suggestions to improve the manuscript and their support for publication. We have addressed all the comments that have been raised and carried out the suggested additional analyses, resulting in a significantly improved revised version of the manuscript. We provide hereafter a detailed point-by-point response to all questions and comments of the three reviewers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Centriole structure has been an attractive but challenging research topic for years. Pierre Gonczy's group has been working on its structure using cryo-electron tomography (cryo-ET). While the axoneme, which has longitudinal periodicity, has been analyzed by several groups by cryo-ET for more than a decade, cryo-ET study of the centriole suffers from a poor signal-to-noise ratio due to its limited length and thus fewer repeats. They chose the centriole of the flagellate Trichonympha, which has exceptionally long centrioles and thus offers the opportunity of relatively straightforward sub-tomogram averaging. Their approach has been successful, and they revealed an intermediate-resolution structure of the cartwheel, the key to 9-fold symmetry formation, and its junction with the triplet microtubules (Guichard et al. 2012, 2013, 2018).

      In this work, they employed modern state-of-the-art cryo-ET techniques, such as direct electron detection and 3D image classification, to upgrade our knowledge of centriole structure. In their past works, the central hub of the cartwheel, made of SAS-6 protein forming a 9-fold complex, was described as an 8 nm periodic object. With improved spatial resolution, they provide further detail with clear polarity, which will deepen our thinking about the initial stage of ciliogenesis. They also compared two Trichonympha species (spp and agilis) as well as another flagellate, Teranympha mirabilis, and extended their intriguing evolutionary and mechanical hypotheses based on structural differences.

      Despite improved spatial resolution, it is still not possible to identify proteins in the cryo-ET map (cellular cryo-ET will not reach such high resolution in the near future). Therefore, this work is rather geometrically descriptive, which will inspire molecular biologists to identify molecules by other methods. Nevertheless, this work demonstrates the capability of cellular cryo-ET, especially for the analysis of structural heterogeneity. Thus, while the biological topics handled are rather specialized to cilia from flagellates, this work will attract the attention of any biologist interested in molecular structure in vivo. It is worthy of publication in a high-ranking journal after addressing the points below. This reviewer believes that the authors can address these points easily with additional analysis.

      We are grateful to the reviewer for the favorable evaluation and the many valuable suggestions, in particular concerning the processing pipeline, which we addressed by additional analyses, as detailed below.

      Major points:

      1. Entire scheme

      A graphic diagram of the entire cartwheel area, summarizing this work, is necessary for the readers' understanding (similar to Fig.6 of the other manuscript, Klena et al.).

      We thank the reviewer for this interesting suggestion, which we fully adhere to. As a result, we have generated a graphical summary of the work, which is shown in the new Figure panels 6B-F. Moreover, Figure 6A provides an evolutionary perspective regarding the presence of the CID and of what is now referred to as the fCID (filamentous CID, previously: FLS, see response to reviewer 3). This also helps to link our findings with the companion manuscript by Klena et al. This new Figure 6 is referred to extensively in the discussion of the revised manuscript (pages 13-16).

      In addition, the averaging scheme should be described in more detail in the Materials and Methods, especially the assumed periodicity. The cartwheel hub was averaged with 25 nm periodicity (as discussed below). Was the pinhead averaged with 16 nm (as detected by FFT in Fig.S2L)? How about the triplet?

      This reviewer is not completely sure whether the longitudinal averaging strategy is justifiable. Since the periodicity of each domain is not trivial, logically the initial average must be done with the size of the least common multiple (or larger). It is likely 96 nm, assuming the 25 nm of the central hub is 3 times the microtubule periodicity and the 16 nm of the pinhead is twice that of the MT. A 96 nm average should be possible with the long cartwheel in this work. The alternative, in case the periodicity is independent of the MT and thus there is no least common multiple, is the random picking and classification mentioned in "4. Periodicity". This should also be possible, since they can pick a sufficient number of particles from long cartwheels.
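
      The least-common-multiple argument above can be checked with a few lines of arithmetic. The sketch below is only an illustration and assumes an 8 nm microtubule repeat (implied by "16 nm is twice the MT"), so that the ~25 nm hub repeat is treated as 3 x 8 = 24 nm; the variable names are hypothetical. Under these assumptions the smallest common repeat is 48 nm, and 96 nm is also a common multiple, consistent with the "(or larger)" wording.

```python
from math import gcd

def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

# Assumed values in nm (not stated as exact numbers in the review):
# microtubule repeat = 8 nm, hub repeat = 3 x MT, pinhead repeat = 2 x MT.
MT_REPEAT_NM = 8
hub_repeat_nm = 3 * MT_REPEAT_NM      # 24 nm, reported as ~25 nm
pinhead_repeat_nm = 2 * MT_REPEAT_NM  # 16 nm

smallest_common_repeat_nm = lcm(hub_repeat_nm, pinhead_repeat_nm)
print(smallest_common_repeat_nm)  # 48

# 96 nm is a multiple of both repeats, so it is also a valid averaging unit.
is_96_common = 96 % hub_repeat_nm == 0 and 96 % pinhead_repeat_nm == 0
print(is_96_common)  # True
```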

      We apologize that the initial version of the manuscript was not sufficiently clear regarding the averaging pipeline that was pursued. To rectify this, we now provide a new Figure S1B to graphically explain the approach followed for STA. As depicted in this figure panel, the step size for sub-volume extraction was 25 nm both centrally and peripherally. This step size was selected because it corresponds to ~3x the major periodicity of ~8.5 nm observed in the power spectra of the sub-volumes. The 25 nm step size is larger than that previously used (i.e. 17 nm in Guichard et al. 2013), in order to identify potential features with larger periodicities. The fact that the step size was of 25 nm in all cases is now mentioned explicitly in the Materials and Methods section of the revised manuscript (line 649).

      We agree with the reviewer that 96 nm averaging is possible given the long cartwheel analyzed here, and such a piece of data was in fact included in the original submission, although with a different purpose. Indeed, we carried out STA using ~(100 nm)³ sub-volumes (with binning 3 to reduce computational time), the results of which are reported in Figure S7 (previously Fig. S6). For the purpose of this analysis, we focused on the lateral organization of the cartwheel, but did not use this dataset to explore other periodicities because of the limitations inherent to a binning 3 data set.

      2. Classification

      The authors analyzed structural heterogeneity inside the cartwheel hub, employing reference-free classification by Relion software. The program reveals multiple coexisting structures: two from Trichonympha agilis and three from Teranympha. Whereas this is an exciting finding and shows a future research direction for this field, interpretation of this classification must be done carefully. It is puzzling that the major (55%) population of T. agilis shows more ambiguous features than the minor population (45%), while the spatial resolutions by FSC are not so different (for example, Fig.2H vs Fig.S5C). In the case of Teranympha, it is even more drastic: Fig.4D (major class) seems blurred along the centriolar axis, compared to Fig. 4E (minor class). This reviewer is afraid that these "major" classes might contain more than one structure and that detailed features are blurred after subaveraging. The apparently good spatial resolution could be explained if two structures coexist and subtomograms are aligned within each subclass. The lower resolution at the spoke region of the major class (Fig.S2A) compared to that of the minor class (Fig.S2D) is probably a sign of heterogeneity within this class. Another risk could be subtomograms with poorer S/N being assigned to one class (due to a lack of features to be properly classified). Fig.S5F (black dots localized in one tomogram) raised this concern.

      The following investigation will help to solve this issue. 1. Extract and re-classify subtomograms belonging to the major population. 2. Direct observation of tomograms. The authors could plot the two classes of Teranympha (as they did for T. agilis in Fig.S5) and find features of the cylindrical cartwheel hub in two conformations (as shown in Fig.4D,E). Since such a feature was directly observed in tomograms from the other manuscript (left panels of Fig.S6A,C in Klena et al.), it should be possible in this work as well.

      We agree with the reviewer that the interpretation of the classification must be done with care, and share her/his interest in better understanding the structural variability between cartwheel classes in T. agilis and T. mirabilis. Although poor S/N may in theory result in erroneous joint classifications, we note that all maps in the original submission stemmed from extensive focused 3D classification, which removed defective and spurious sub-volumes, nevertheless defining distinct classes in the cases reported. Obviously, however, we cannot exclude that much larger data sets and future software advances may lead to the identification of additional features that would allow further sub-classes to be identified.

      Regardless, we followed the two suggestions the reviewer offered to us and have (1) extracted and re-classified sub-tomograms belonging to the major populations and (2) undertaken a direct observation of tomograms. These two points are developed in turn below.

      (1) We have performed a further round of classification of the major populations in T. agilis (55 % class) and T. mirabilis (64 % class), to assess whether additional sub-classes might be identified and thus help further improve the quality of the central cartwheel map. However, this additional round did not yield new sub-classes nor notable improvement in the map quality as judged by visual inspections. We show in Rebuttal Figure 1 a comparison in each case of the original STA and the corresponding STA upon such re-classification. Importantly, all conclusions spelled out in the original submission hold upon further re-classification, indicating that the initial classification converged to the best map quality based on the current data set and available computational resources.

      (2) We have followed the suggestion of the reviewer and now show raw tomograms to confirm that the classes correspond to bona fide structures and not to processing artefacts (new Figures S1C-F). The resulting new Figure S1D for instance shows that the striking variations observed between classes in the T. agilis STA are also visible in the raw tomogram. The more subtle variations among T. mirabilis classes are more difficult to observe in the raw tomogram, but inherent variations that reflect the presence of two classes are nevertheless observed.

      Furthermore, following the reviewer’s suggestion, we now mapped the distribution of the two T. mirabilis cartwheel classes onto tomograms, revealing that both classes can occur next to each other within the same centriole (new Figure S8E).

      3. Periodicity mismatch

      In Fig. 2C,D, the periodicity of the CID differs from that of the stacked SAS-6 ring (8.5 nm and 8.0 nm). Do the authors think this is a significant difference or within error? The same question applies to other subtomogram averages. It would be nice to show errors as in their other manuscript (Fig.3C of Klena et al.) and clarify their idea. If there is a systematic difference in periodicity between the stacked ring and the CID, this shift will accumulate through the entire cartwheel region (after 100 nm, the 8.5 nm/8.0 nm difference can accumulate to ~6 nm), which should change the entire view of the subtomogram and become the main factor in classification (periodicity mismatch). This artifact (or influence) should be removed (or separately evaluated) by masking the CID (out and in) and running classification separately. By clarifying this, the quality of the major subaverages (mentioned in the previous paragraph) could be improved.
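
      The "~6 nm after 100 nm" estimate in the comment above can be reproduced with a short calculation. This is a minimal sketch, assuming the two lattices start in register and that repeats are counted in units of the 8.0 nm period; the function name is hypothetical.

```python
def accumulated_offset_nm(period_a_nm: float, period_b_nm: float, span_nm: float) -> float:
    """Offset between two periodic lattices after span_nm.

    Counts how many repeats of lattice b fit in the span and multiplies
    by the per-repeat difference between the two periods.
    """
    n_repeats = span_nm / period_b_nm
    return n_repeats * abs(period_a_nm - period_b_nm)

# 8.5 nm (CID) vs 8.0 nm (stacked ring) over a 100 nm stretch of cartwheel
offset = accumulated_offset_nm(8.5, 8.0, 100.0)
print(offset)  # 6.25
```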

      The reviewer wonders whether there might be a periodicity discrepancy within one map, for instance between CID and spokes in the T. spp. cartwheel map (Fig. 2C and Fig. 2D). Here, the periodicity determined from the STA maps is 8.5 ± 0.2 nm (SD, N=4) for the CID and 8.0 ± 1.5 nm (SD, N=2) for the spokes. Based on these standard deviations, there is indeed no significant difference between the two, and thus no periodicity discrepancy. The same applies for measurements in T. agilis and T. mirabilis. The SDs were reported already in the figure legends of the original submission, and we would prefer to leave them there if possible and not mention them in the figures, which are pretty busy as is. We apologize if this was not clear enough in the initial manuscript. Likewise, one may wonder whether there might be periodicity discrepancies between structures from distinct maps, for instance between CID and A-links from T. spp. (Fig. 2C and Fig. 3D). Again, the measurements are within error, since the distance between adjacent CIDs is 8.5 ± 0.2 nm (N=4) and between adjacent A-links 8.4 ± 0.4 nm (N=6); a similar conclusion applies for the corresponding measurement comparisons in T. agilis and T. mirabilis. The figure legends have been altered in the revised manuscript to spell out that there are no significant differences between periodicities (lines 856-858).

      Furthermore, we would like to stress that, by definition, STA values are average distances. For instance, in the case of T. spp., the central cartwheel STA was obtained from 511 sub-volumes, and thus the reported N=2 represents the average distance from 511 sub-volumes. Since this is an average, errors can therefore not accumulate over longer distances. This point has also been clarified in the figure legends (lines 856-858).

      4. Periodicity

      They averaged subtomograms extracted with a spacing of 252 Å, with the initial average as the first template (p.18 Line22). This means they assumed 25 nm periodicity from the beginning and excluded different or larger unit sizes (with a wide search range they could detect a different periodicity, but would still be biased by the initially assumed 25 nm). The 25 nm average allowed them to see more detail than before (when they assumed 8 nm periodicity), but there is still a risk of bias from references. To avoid this risk, this reviewer would propose classification of randomly extracted subtomograms (picked along the cylindrical hub or along the triplet microtubules, i.e. one-dimensional random picking). This experiment will end up with multiple sub-averages that are shifted by 25 nm (or multiples thereof) from each other, which would then prove their assumption.
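
      As a rough illustration of the "within error" argument for the CID and spoke periodicities, the reported means, standard deviations and N values can be combined into a Welch-type statistic. This is only a sketch and not the analysis performed by the authors: it treats the reported N as independent measurements, and with N=2 no formal test has much power. The function name is hypothetical.

```python
from math import sqrt

def welch_t(mean1: float, sd1: float, n1: int,
            mean2: float, sd2: float, n2: int) -> float:
    """Difference of two means divided by its combined standard error."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

# CID: 8.5 +/- 0.2 nm (N=4); spokes: 8.0 +/- 1.5 nm (N=2), as reported above
t = welch_t(8.5, 0.2, 4, 8.0, 1.5, 2)
print(round(t, 2))  # 0.47
```

      The 0.5 nm difference is less than half of the combined standard error, which is in line with the statement that the two periodicities are not distinguishable with these data.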

      We agree with the reviewer that in theory the choice of periodicity could introduce a bias. This is why we have chosen a larger step size than in our initial work, corresponding to ~3x the major periodicity of ~8.5 nm observed in the power spectrum of the sub-volumes, as mentioned above. Regardless, following the reviewer’s suggestion, we have now explored other types of periodicities by re-analyzing the dataset through extraction of non-overlapping sub-volumes along the proximal-distal centriole axis. In doing so, we randomized the starting position of the first box between tomograms, reaching the same goal as with random picking but maximizing the number of sub-volumes. We carried out this analysis for all T. spp., T. agilis and T. mirabilis cartwheel classes, and found no notable differences that would affect the conclusions of the manuscript compared to the initial overlapping sub-volume classification, albeit generally with a noisier STA due to the lower number of sub-volumes. A comparison of the two approaches is provided in Rebuttal Figure 2. Moreover, all the points regarding the choice of periodicity have been further clarified in the expanded Materials and Methods section (pages 19-21).
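
      The extraction scheme described in this response (non-overlapping sub-volumes along the proximal-distal axis, with a randomized starting position per tomogram) can be sketched as follows. This is a hypothetical illustration, not the authors' pipeline: it assumes a box length equal to the 25 nm step, works in one dimension only, and the function and parameter names are invented for the example.

```python
import random

def box_centers_nm(axis_length_nm, box_nm=25.0, rng=None):
    """Centers of non-overlapping boxes along a 1D axis.

    The first box starts at a random offset in [0, box_nm], so that box
    positions are not locked to the same phase in every tomogram.
    """
    if rng is None:
        rng = random.Random()
    start = rng.uniform(0.0, box_nm)
    centers = []
    center = start + box_nm / 2
    # Keep adding boxes while the whole box still fits inside the axis.
    while center + box_nm / 2 <= axis_length_nm:
        centers.append(center)
        center += box_nm
    return centers

# Example: a 400 nm stretch of centriole, seeded for reproducibility
centers = box_centers_nm(400.0, rng=random.Random(0))
print(len(centers), centers[:3])
```

      Because consecutive centers are exactly one box length apart, no two sub-volumes share voxels along the axis, at the cost of fewer sub-volumes than with overlapping extraction.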

      Minor points:

      They discussed difference of stacked SAS-6 rings in the cartwheel from various species. How much is the sequence difference of SAS-6 among these species?

      Unfortunately, no genomic or transcriptomic data has been published for the species investigated here, although the sparse molecular data available from small subunit rRNA sequences allows one to establish an overall molecular phylogeny. We previously identified a SAS-6 homologue in T. agilis (Guichard et al. 2013), which shares 20 % identity and 45 % similarity with C. reinhardtii SAS-6. Despite low sequence conservation, the structural conservation of SAS-6 is predicted to be high between the two organisms (Guichard et al. 2013). We apologize if these points were not expressed sufficiently clearly in the initial rendition and have adapted the wording in the revised manuscript (lines 325-332).

      Are the authors sure that CID is nine-fold symmetric? It is not trivial.

      We thank the reviewer for bringing up this interesting point. We have applied 9-fold symmetrization to the entire central cartwheel comprising spokes, hub and CID/ fCID, a choice guided by the apparent 9-fold symmetry of the spokes and peripheral element. We investigated the impact of symmetrization on the CID by relaxing symmetry from C9 to C1 during refinement, but did not observe a difference, and thus continued with C9 symmetry, which improves map resolution by S/N ratio enhancement and additional missing wedge compensation. In addition, we have also analyzed the CID without symmetrization, as reported in Figure S7 (previously: Fig. S6). Note that these maps were generated with larger sub-volumes centered on the spokes to comprise hub, spokes and microtubule triplets, explaining the resulting lower resolution, as the missing wedge is not compensated. Despite these limitations, however, the unsymmetrized CID shown in Figure S7A and S7E resembles the one in the symmetrized maps of Figure 2, indicating that the CID indeed exhibits 9-fold radial symmetry. That this is the case is spelled out explicitly in the revised manuscript (lines 1145-1147).

      Fig.1C: Another cross-section from the distal region will be helpful. A longer scale bar is better for readers' understanding.

      We understand that the reviewer is curious about the distal region, and cross-section views of resin-embedded sections from T. agilis are available and could be provided if necessary. However, given that the focus of the manuscript is strictly on the cartwheel-bearing proximal region, we felt that featuring the distal region in detail would break the narrative. Therefore, we suggest keeping Figure 1 as in the original manuscript. Following the reviewer’s suggestion, we increased the size of the scale bars from 10 nm to 20 nm in Figure 1C as well as in the corresponding Figure S8C.

      Fig.S6F: It would be informative if the subclasses (25% and 20%) are distinguished in this mapping.

      As per the reviewer’s request, we provide in Rebuttal Figure 3 a side-by-side comparison of the T. agilis 25 % and 20 % classes centered on the spokes, which are noisier than the composite 45 % class due to the lower number of sub-volumes in each sub-class. Given that there are no notable differences between the two maps that would affect any of the conclusions of the manuscript, we feel it is best to keep what is now Figure S7F (previously: Fig. S6F) unchanged in the revised manuscript.

      A figure to explain the classification scheme will help readers understand. How many subtomograms did the classification start with? Was the 45% class classified into two (25% and 20%) groups by two-step classification or at once (were the entire subtomograms classified into three groups directly)?

      We thank the reviewer for this useful suggestion. As a result, we have generated a new Supplemental Figure S1G-J that provides a graphical overview of the classification scheme, together with sub-volume numbers for all deposited maps, thus nicely complementing Table S1.

      Reviewer #1 (Significance (Required)):

      Nevertheless, this work demonstrated capability of cellular cryo-ET, especially analysis of structural heterogeneity. Thus, while biological topics handled are rather specialized for cilia from flagellate, this work will attract attention of any biologist interested in molecular structure in vivo. It is worth for publication in a high journal after addressing the points above. This reviewer believes that the authors can address these points easily with additional analysis.

      We reiterate our thanks to this reviewer for her/his favorable evaluation and detailed suggestions, which enabled us to generate a strengthened manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Here, Nazarov and colleagues report sub-tomogram average (STA) maps of centrioles with 16 to 40 Å resolution from Trichonympha spp., Trichonympha agilis, and Teranympha mirabilis. Even though the authors have previously described the centriole architecture of T. spp, these STA maps of higher resolution revealed new features of centrioles, like polarized Cartwheel Inner Density (CID) and the pinhead. They also observed Filament-like structure (FLS) from T. mirabilis which seems to correspond to the CID from other species. Interestingly, they suggest that one and two SASS6 rings are stacked in an alternative fashion to make the central hub in T. mirabilis (Figure 5). The following issue should be addressed:

      Major points

      • Figure 4E. Authors mentioned in the manuscript that "We observed that every other double hub units in the 36% T. mirabilis class appears to exhibit a slight tilt angle relative to the vertical axis". When I see the other side, it does not seem to be tilted. Could the authors explain this?

      We apologize that this aspect was not explained in sufficient detail. The left and right sides of the hub indeed appeared different in transverse views across the cartwheel center (previous Fig. 4E). This was because the area we selected in the original submission was centered on one emanating spoke. Due to the 9-fold symmetry one spoke density was selected on the right side, while the region between two spokes was displayed on the left side (as was illustrated by the slice across the center in previous Figure 4A; dashed rectangles in 4.0 nm panel). We have now selected a larger area to include spokes from both sides of the hub and thus better visualize this offset as shown in the modified Figure 4D-E.

      Reviewer #2 (Significance (Required)):

      I believe these results are of interest for all centrosome researchers and would like to recommend this manuscript be published in the EMBO journal which is affiliated with the Review Commons.

      We thank the reviewer for the recommendation to submit the revised manuscript to EMBO Journal, which we have followed.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript Nazarov et al. use cryo-electron tomography (CET) to analyse the structure of the centriole cartwheel. The Gonczy lab have previously generated a ground-breaking structure of the cartwheel from Trichonympha spp (T. spp.) (Guichard et al., Science, 2012; Guichard et al., Curr. Biol., 2013). This work is a direct continuation of those studies but using modern technology to get higher resolution images of the T. spp. cartwheel and comparing this to the cartwheel from Trichonympha agilis and from another distantly related flagellate Teranympha mirabilis.

      The data is generally well presented and of high quality. I am not an expert in CET, so it would be advisable to get the opinion from a reviewer who is, but the Gonczy lab are experienced in these techniques so I would not anticipate any problems. I have to admit that the title of the paper did not excite me, and I expected this to be a very worthy, but incremental study. It was a pleasure to find out that the extra detail provided by the increased resolution has revealed several new and unexpected features that have important implications for our understanding of cartwheel assembly and function. Most important are the potential asymmetry of the cartwheel hub, apparent variations in the packing mechanism of the stacked rings (even within the same cartwheel), and the potential offsetting of ring stacking. These findings will be of great interest to the field, and so I am strongly supportive of publication in The EMBO Journal. I have only a few points that I think the authors should consider.

      We thank the reviewer for this positive feedback and the recommendation to submit to EMBO Journal, which we hereby follow.

      Prompted by the comment of the reviewer, we revised the title to make it more informative and appealing to readers: “Novel features of centriole polarity and cartwheel stacking revealed by cryo-tomography”.

      • Nazarov et al. conclude that the cartwheel structure is intrinsically asymmetric. This is most convincingly based on the displacement of the CID within the hub, but they state in the Discussion that the potential offset between the Sas-6 double rings generates an inherently polar structure. I didn't understand why this is the case. Looking at Fig.S9A,B I can see that the offset in B could tilt to the left (as shown here) or to the right (if the structure was flipped by 180°). But I couldn't see how this makes this structure polar in the sense that a molecule coming into dock with the structure could only bind to one side of the offset structure shown in B, but to both sides of the aligned structure shown in A. I think this needs to be explained better, as it is crucial to understand where any potential polarity in the cartwheel structure comes from.

      We apologize for not having been sufficiently clear about how two SAS-6 rings with an offset could impart organelle polarity. The reviewer is correct that an offset between superimposed rings alone is not sufficient to generate polarity at a larger scale. The important point we would like to stress, however, is that we discovered concerted polarity in multiple locations, from the central hub to the peripheral elements as illustrated in Fig. S7C-D, S7G-H, S7K-L and S7O-P (previously: Fig. S6). Prompted by the reviewer’s comment, we now better emphasize the asymmetric tilt angles of merging spokes, as highlighted also in the improved Figure S7. This asymmetric spoke tilt angle allows one to discriminate the proximal and distal side of a double SAS-6 ring, which is now explained better in the text (lines 259-263 & 502-510).

      • Related to this last point, in a co-submitted paper Klena et al. do not report such an asymmetry in the hub structures they have solved from several different species (neither in the tilting of the hub, nor in the displacement of the CID). I think it would be worth both sets of authors commenting on this point.

      We agree that comparing and contrasting the results of the two companion manuscripts is important and we have updated the text as a consequence in several places (lines 444, 467, 507, 536, 985, 1000). We know from our previous work (Guichard et al. 2013) that the asymmetry of the hub and spoke is not visible at lower resolution. In the accompanying manuscript by Klena et al., no offset in the hub or asymmetric CID localization is reported, probably due to lower resolution and differences between species.

      • The authors' data strongly suggest that the T. ag. and Te. mir. hubs are composed of a mixture of single and double Sas-6 rings. In contrast, the T. spp. cartwheel only has a single class of rings, but it wasn't absolutely clear if the authors think this comprises a single or double ring. In the text it is presented as though the elongation of the hub densities in the vertical direction is a new feature of the T. ag. cartwheel (Fig.2H,I), but to me it looks as though this is also apparent in the T. spp. cartwheel (Fig.2C,D). The authors should address this directly and, if they believe that T. spp. has a double ring, they should comment on whether this more regular structure seems to have offset rings. If not, then the offset rings are unlikely to be the source of asymmetry that leads to the asymmetric displacement of the CID. Finally, if the authors think these are double rings, they should also be clear that they would now slightly re-interpret their original T. spp. cartwheel model (Figure 2, Guichard et al., Curr. Biol.). There is no embarrassment in this: a higher resolution structure has simply revealed more detail.

      We apologize if the conclusions drawn about T. spp. cartwheel hubs were not sufficiently clearly expressed. Like the reviewer, we think that elongated hub elements are also discernible in T. spp., something that is also illustrated by the intensity plot profile in Figure 2C (double peaks on light blue line). These points are spelled out more explicitly in the revised manuscript (lines 177-179). In addition, to emphasize the conservation of the double hub units in both Trichonympha species, we have likewise adapted the text for T. agilis (lines 198-201).

      As for the offset observed within T. spp. spoke densities in Figure S10H, we interpret this as evidence for an offset of the double ring at the level of the hub, although we have not observed such offset in T. spp. for reasons that are unclear. The fact that this revises our previous interpretation based on a lower resolution map of T. spp. was already mentioned in the initial submission but is now better emphasized (lines 171-172 & 179-181).

      • The authors conclude that T. mirabilis cartwheels lack a CID and instead have a filament-like structure (FLS). I wonder whether it is more likely that the FLS is really a highly derived CID that appears to be structurally distinct when analysed in this way, but that will ultimately have a similar molecular composition. This situation might be analogous to the central tube in C. elegans, which by EM appears to be distinct from the central cartwheel seen in most other species, but is of course still composed of Sas-6. This historical tube/cartwheel nomenclature is now cumbersome to deal with, so perhaps it would be better to be cautious and not give the T. mirabilis structure a completely new name; how about "unusual CID" (uCID).

      We share the view that the CID and the “FLS” (the term used in the initial submission) may have a related molecular composition and function, as we had also speculated in the discussion of the original submission. Following the reviewer’s suggestion, and in an effort to have a more uniform nomenclature, we propose to dub the T. mirabilis structure “filamentous CID” (fCID). This highlights better the similar location of these two entities and their potential shared function, while stressing the filamentous nature of the fCID. We further emphasize this point by providing the new Figure 6A to compare the presence of the two entities in select species. The discussion has also been adapted accordingly (pages 13-14).

      Rebuttal Figure Legends

      Rebuttal Figure 1: Re-classification of major classes

      (A-D) Transverse (top) and longitudinal (bottom) views of T. agilis (A, B) and T. mirabilis (C, D) central cartwheel 3D maps. The final major classes reported in the manuscript (A: 55 % class, C: 64 % class) were subjected to re-classification, which again yielded one major class in each case, with no notable improvement (B, D).

      Rebuttal Figure 2: Reclassification with non-overlapping sub-volumes

      (A-F) Transverse (top) and longitudinal (bottom) views of T. spp. (A, B) T. agilis (C, D) and T. mirabilis (E, F) central cartwheel 3D maps. The final maps reported in the manuscript (A, C, E) were generated with a 25 nm step size, yielding overlapping sub-volumes, whereas the maps in (B, D, F) were generated from non-overlapping sub-volumes, with no notable differences between the two that would affect the conclusions of the manuscript.

      Rebuttal Figure 3: Polar centriolar cartwheel upon sub-classification

      (A-C) 3D transverse views of non-symmetrized STA centered on the spokes to jointly show the central cartwheel and peripheral elements in the T. agilis 45 % class (A), as well as separately in the 25 % class (B) and 20% class (C). No notable differences are apparent following such re-classification, apart from the output being noisier due to the lower number of sub-volumes in each sub-class.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript Nazarov et al. use cryo-electron tomography (CET) to analyse the structure of the centriole cartwheel. The Gonczy lab have previously generated a ground-breaking structure of the cartwheel from Trichonympha spp (T. spp.) (Guichard et al., Science, 2012; Guichard et al., Curr. Biol., 2013). This work is a direct continuation of those studies but using modern technology to get higher resolution images of the T. spp. cartwheel, and comparing this to the cartwheel from Trichonympha agilis and from another distantly related flagellate Teranympha mirabilis.

      The data is generally well presented and of high quality. I am not an expert in CET, so it would be advisable to get the opinion from a reviewer who is, but the Gonczy lab are experienced in these techniques so I would not anticipate any problems. I have to admit that the title of the paper did not excite me, and I expected this to be a very worthy, but incremental study. It was a pleasure to find out that the extra detail provided by the increased resolution has revealed several new and unexpected features that have important implications for our understanding of cartwheel assembly and function. Most important are the potential asymmetry of the cartwheel hub, apparent variations in the packing mechanism of the stacked rings (even within the same cartwheel), and the potential offsetting of ring stacking. These findings will be of great interest to the field, and so I am strongly supportive of publication in The EMBO Journal. I have only a few points that I think the authors should consider.

      1. Nazarov et al. conclude that the cartwheel structure is intrinsically asymmetric. This is most convincingly based on the displacement of the CID within the hub, but they state in the Discussion that the potential offset between the Sas-6 double rings generates an inherently polar structure. I didn't understand why this is the case. Looking at Fig.S9A,B I can see that the offset in B could tilt to the left (as shown here) or to the right (if the structure was flipped by 180°). But I couldn't see how this makes this structure polar in the sense that a molecule coming into dock with the structure could only bind to one side of the offset structure shown in B, but to both sides of the aligned structure shown in A. I think this needs to be explained better, as it is crucial to understand where any potential polarity in the cartwheel structure comes from.

      2. Related to this last point, in a co-submitted paper Klena et al. do not report such an asymmetry in the hub structures they have solved from several different species (neither in the tilting of the hub, nor in the displacement of the CID). I think it would be worth both sets of authors commenting on this point.

      3. The authors' data strongly suggest that the T. ag. and Te. mir. hubs are composed of a mixture of single and double Sas-6 rings. In contrast, the T. spp. cartwheel only has a single class of rings, but it wasn't absolutely clear if the authors think this comprises a single or double ring. In the text it is presented as though the elongation of the hub densities in the vertical direction is a new feature of the T. ag. cartwheel (Fig.2H,I), but to me it looks as though this is also apparent in the T. spp. cartwheel (Fig.2C,D). The authors should address this directly and, if they believe that T. spp. has a double ring, they should comment on whether this more regular structure seems to have offset rings. If not, then the offset rings are unlikely to be the source of asymmetry that leads to the asymmetric displacement of the CID. Finally, if the authors think these are double rings, they should also be clear that they would now slightly re-interpret their original T. spp. cartwheel model (Figure 2, Guichard et al., Curr. Biol.). There is no embarrassment in this: a higher resolution structure has simply revealed more detail.

      4. The authors conclude that T. mirabilis cartwheels lack a CID and instead have a filament-like structure (FLS). I wonder whether it is more likely that the FLS is really a highly derived CID that appears to be structurally distinct when analysed in this way, but that will ultimately have a similar molecular composition. This situation might be analogous to the central tube in C. elegans, which by EM appears to be distinct from the central cartwheel seen in most other species, but is of course still composed of Sas-6. This historical tube/cartwheel nomenclature is now cumbersome to deal with, so perhaps it would be better to be cautious and not give the T. mirabilis structure a completely new name-how about "unusual CID" (uCID).

      Significance

      see above

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Here, Nazarov and colleagues report sub-tomogram average (STA) maps of centrioles with 16 to 40 Å resolution from Trichonympha spp., Trichonympha agilis, and Teranympha mirabilis. Even though the authors have previously described the centriole architecture of T. spp, these STA maps of higher resolution revealed new features of centrioles, like polarized Cartwheel Inner Density (CID) and the pinhead. They also observed Filament-like structure (FLS) from T. mirabilis which seems to correspond to the CID from other species. Interestingly, they suggest that one and two SASS6 rings are stacked in an alternative fashion to make the central hub in T. mirabilis (Figure 5). The following issue should be addressed:

      Major points

      1. Figure 4E. Authors mentioned in the manuscript that "We observed that every other double hub units in the 36% T. mirabilis class appears to exhibit a slight tilt angle relative to the vertical axis". When I see the other side, it does not seem to be tilted. Could the authors explain this?

      Minor Points

      1. Page 11, I think Fig. 9G indicates Fig. S9G.

      Significance

      I believe these results are of interest for all centrosome researchers, and would like to recommend this manuscript be published in the EMBO journal which is affiliated with the Review Commons.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Centriole structure has been an attractive but challenging research topic for years. Pierre Gonczy's group has been working on its structure using cryo-electron tomography (cryo-ET). While the axoneme, which has longitudinal periodicity, was analyzed by several groups by cryo-ET for more than a decade, cryo-ET study on the centriole suffers from poor signal-to-noise ratio due to its limited length and thus fewer repeats. They chose the centriole of flagellate Trichonympha, which have exceptionally long centrioles and thus offer opportunity of relatively straightforward subtomogram averaging. Their approach has been successful and they revealed intermediate resolution structure of the cartwheel, key of 9-fold symmetry formation, and its joint to triplet microtubules (Guichard et al. 2012, 2013, 2018). In this work, they employed modern state-of-the-art cryo-ET techniques, such as direct electron detection and 3D image classification, to upgrade our knowledge of centriole structure. In their past works, the central hub of the cartwheel, made of SAS-6 protein forming 9-fold complex, was described as an 8nm periodic object. With improved spatial resolution, they provided further detail with clear polarity, which will deepen our thought about the initial stage of ciliogenesis. They also compared two Trichonympha species (spp and agilis) as well as another flagellate, Teranympha mirabilis, and extended their intriguing evolutionary and mechanical hypotheses based on structural differences. Despite improved spatial resolution, it is still not possible to identify proteins in the cryo-ET map (cellular cryo-ET will not reach such high resolution in the near future). Therefore this work is rather geometrically descriptive, which will inspire molecular biologists to identify molecules by other methods. Nevertheless this work demonstrated capability of cellular cryo-ET, especially analysis of structural heterogeneity. Thus, while biological topics handled are rather specialized for cilia from flagellate, this work will attract attention of any biologist interested in molecular structure in vivo. It is worth for publication in a high Journal after addressing the points below. This reviewer believes that the authors can address these points easily with additional analysis.

      Major points:

      1. Entire scheme: A graphic diagram of the entire cartwheel area, summarizing this work, is necessary for the readers' understanding (similar to Fig.6 of the other manuscript, Klena et al.). Then the averaging scheme should be shown in more detail, especially the assumption of periodicity, in Materials and Methods. The cartwheel hub was averaged with 25nm periodicity (as discussed below). Was the pinhead averaged with 16nm (as detected by FFT in Fig.S2L)? How about the triplet? This reviewer is not completely sure if the longitudinal averaging strategy is justifiable. Since periodicity of each domain is not trivial, logically the initial average must be done with the size of least common multiple (or larger). It is likely 96nm, assuming 25nm of the central hub is 3 times of microtubule periodicity and 16nm of the pinhead is twice of MT. 96nm average should be possible with a long cartwheel in this work. An alternative, in case periodicity is independent of MT and thus there is no least common multiple, is the random picking and classification mentioned in "4. Periodicity". This should also be possible, since they can pick enough number of particles from long cartwheels.
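      The least-common-multiple argument in the comment above can be checked with a short calculation. The sketch below expresses each periodicity as an integer number of microtubule lattice repeats, using the multiples assumed in the comment (3 for the hub, 2 for the pinhead). The nominal 8 nm unit and all variable names are assumptions made for illustration. Under these assumptions the least common multiple is 6 repeats (about 48 nm), and 96 nm is twice that, so it is also a common multiple and fits the "or larger" case.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


UNIT_NM = 8          # assumed nominal microtubule lattice repeat, in nm
hub_repeats = 3      # hub unit assumed to span 3 lattice repeats (~25 nm)
pinhead_repeats = 2  # pinhead unit assumed to span 2 lattice repeats (16 nm)

common_repeats = lcm(hub_repeats, pinhead_repeats)
common_nm = common_repeats * UNIT_NM

# Check whether 96 nm is an integer multiple of the common unit.
is_96_common = (96 % common_nm == 0)
```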

      2. Classification: The authors analyzed structural heterogeneity inside the cartwheel hub, employing reference-free classification by Relion software. The program reveals multiple coexisting structures - two from Trichonympha agilis and three from Teranympha, respectively. Whereas this is an exciting finding and shows future research direction of this field, interpretation of this classification must be done carefully. It is puzzling that major (55%) population of T. agilis shows more ambiguous features than the minor population (45%), while spatial resolutions by FSC are not so different - for example, Fig.2H vs Fig.S5C. In case of Teranympha, it is even more drastic - Fig.4D (major class) seems blurred along the centriolar axis, compared to Fig. 4E (minor class). This reviewer is afraid that these "major" classes might contain more than one structure and after subaveraging be blurred in detailed features. The apparent good spatial resolution could be explained, when two structures coexist and subtomograms are aligned within each subclass. Probably lower resolution at the spoke region of the major class (Fig.S2A) than that of the minor class (Fig.S2D) is a sign of heterogeneity within this class. Another risk could be subtomograms with poorer S/N being categorized to one class (due to lack of feature to be properly classified). Fig.S5F (black dots localized in one tomogram) raised this concern. The following investigation will help to solve this issue. 1. Extract and re-classify subtomograms belonging to the major population. 2. Direct observation of tomograms. The authors could plot two classes of Teranympha (as they did for T. agilis in Fig.S5) and find features of the cylindrical cartwheel hub in two conformations (as shown in Fig.4DE). Since such a feature was directly observed in tomograms from the other manuscript (left panels of Fig.S6AC in Klena et al.), it should be possible in this work as well.

      3. Periodicity mismatch: In Fig. 2CD, periodicity of CID has discrepancy from that of the stacked SAS-6 ring (8.5nm and 8.0nm). Do the authors think this is a significant difference or within error? The same question can occur to other subtomogram averages. It would be nice to show errors as shown in their other manuscript (Fig.3C of Klena et al.) and clarify their idea. If it is systematic difference of periodicity between the stacked ring and CID, this shift will be accumulated through the entire cartwheel region - after 100nm, 8.5nm/8.0nm difference can be accumulated to ~6nm, which should change the entire view of the subtomogram - and the main factor to be classified (periodicity mismatch). This artifact (or influence) should be removed (or separately evaluated) by masking CID (out and in) and run classification separately. By clarifying this, the quality of the major subaverages (mentioned in the previous paragraph) could be improved.
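      The accumulated shift quoted in the comment above can be reproduced with a few lines of arithmetic. This sketch assumes two perfectly regular lattices with periods of 8.0 nm and 8.5 nm and computes how far they drift apart over a 100 nm stretch. The variable names are illustrative only.

```python
ring_period_nm = 8.0  # stacked SAS-6 ring periodicity quoted in the comment
cid_period_nm = 8.5   # CID periodicity quoted in the comment
span_nm = 100.0       # length over which the drift accumulates

# Number of ring repeats that fit in the span.
n_repeats = span_nm / ring_period_nm

# Each repeat adds the period difference to the relative shift.
drift_nm = n_repeats * (cid_period_nm - ring_period_nm)
```

      With these inputs the drift comes out at 6.25 nm, consistent with the ~6nm estimate given in the comment.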

      4. Periodicity: They averaged subtomograms extracted with spacing of 252 Å with initial average as the first template (p.18 Line22). This means they assumed 25nm periodicity from the beginning and excluded different or larger unit size (if they take search range wide, they could detect different periodicity, but will still be biased by initially assumed 25nm). 25nm average allowed them to see more detail than before (when they assumed 8nm periodicity), but there is still a risk of bias from references. To avoid this risk, this reviewer would propose classification of randomly extracted (but of course along the cylindrical hub or along the triplet microtubules, so one-dimensionally random picking) subtomograms. This experiment will end up with multiple subaverages, which are 25nm (or multiple times of that) shifted from each other. Then it will prove their assumption.

      Minor points:

      They discussed difference of stacked SAS-6 rings in the cartwheel from various species. How much is the sequence difference of SAS-6 among these species?

      Are the authors sure that CID is nine-fold symmetric? It is not trivial.

      p.7 Line21 "Fig.S1D-O": D-L

      p.8 Line1: It would be nice if more detailed description about MIPs, correlating to recent high resolution works from Bui and Brown labs.

      p.9 Line6 "Focused 3D classification...": This sentence is unclear.

      p.18 5 lines from bottom "S6C, S6F": How can these panels be power spectra to measure spacing? Typo?

      Fig.1C: Another cross-section from the distal region will be helpful. A longer scale bar is better for readers' understanding.

      p.29 Line6: pin -> pink

      Fig.S6F: It would be informative if the subclasses (25% and 20%) are distinguished in this mapping.

      A figure to explain the classification scheme will help readers understand. How many subtomograms did the classification start with? Was the 45% class classified into two (25% and 20%) groups by two-step classification or at once (were the entire subtomograms classified into three groups directly)?

      Significance

      Nevertheless this work demonstrated capability of cellular cryo-ET, especially analysis of structural heterogeneity. Thus, while biological topics handled are rather specialized for cilia from flagellate, this work will attract attention of any biologist interested in molecular structure in vivo. It is worth for publication in a high journal after addressing the points above. This reviewer believes that the authors can address these points easily with additional analysis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Note from the authors (AU): This manuscript has been reviewed by subject experts for Review Commons. The authors would like to thank the reviewers for their comments on the manuscript, and the editor for patience with our response. Our response was delayed due to the COVID-19 lock-down situation in our institution. Now we are pleased to provide the following point-by-point response, as detailed below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Suomalainen et al. describes a fluorescence-based approach combined with high-resolution confocal microscopy to study the heterogeneity of adenovirus infection in a population of human cells. The main focus of the authors is the detection of viral transcripts in infected cells, how this correlates with viral genomes, the cell state, and how it varies between different cells in a single population. The paper is generally well written and easy to read, with a few typos, although I found parts of it to be somewhat lengthy and repetitive. Particularly the results section could be pruned somewhat for readability and clarity. The major limitation of the study as it stands is its overall impact and novelty, which limits journal selection somewhat. A very similar study was recently published, which the authors cite (Krzywkowski et al, 2017). Nevertheless, I think the study design is rigorous and well executed, but I do have some specific comments which may enhance its overall impact and novelty.

      **Major:**

      Results "Visualization of AdV-C5..." section:

      Why not also look at normal cells that can be synchronized? Cancer cells, such as A549 will by definition be highly heterogenous and at all phases of the cell cycle. Primary non-transformed cells can easily be synchronized by contact inhibition and are much more physiologically relevant.

      AU: In the current manuscript, we concentrated on the early phases of the AdV-C5 infection, on the question how virus gene expression is initiated and whether the cell cycle phase of the host cell impacts the initiation of virus gene expression. Answering these questions requires use of cells that express good amount of virus receptors so that viruses efficiently bind to the cells and infections can be synchronized so that extended time does not elapse between virus addition and accumulation of E1A transcripts; extended time between these two steps would make interpretation of the results more complex since cells could have progressed from one cell cycle stage to another during the experiment. Furthermore, having cells at all phases of the cell cycle is actually a benefit since then the experiment can be carried out under an “unperturbed” condition; all cell cycle synchronization methods have pleiotropic effects on the cells.

      It is true that primary non-transformed cells are physiologically more relevant than cancer cells, but primary cells have issues with donor-to-donor variability and many primary cells express rather low amounts of AdV-C5 receptors, so synchronized infections in these cells are not possible. Furthermore, the extended cell morphology of many normal fibroblast cell lines and the tendency of cell extensions from neighboring cells to overlap makes fluorescent images of these cells incompatible for automated cell segmentation.

      Here, we also provide data from HDF-TERT cells (nontransformed human diploid fibroblasts immortalized by human telomerase expression) to show that two of our key findings from A549 cells are not artefacts of cancer cells. That is, akin to A549 cells, the infected HDF-TERT cells accumulate high numbers of E1A transcripts (Fig.1C), and also in these cells nuclear vDNA numbers do not predict the cytoplasmic E1A transcript counts during early phases of infection (S2C Fig). However, since HDF-TERT cells are rather inefficiently infected by AdV-C5, correlation of early E1A transcript accumulation to the cell cycle phase of the host cell could not be done in these cells. We have been unable to identify primary or normal immortalized cells that would be easily available and efficiently infected by AdV-C5 (synchronized infection with short time elapsed between virus addition and accumulation of E1A transcripts).

      "The virus particles bound..." - Can the spatial resolution of a confocal microscope truly differentiate individual particles that are sub-wavelength in size? What about the sensitivity for single particles? Some sort of experiment to show that single particles can be detected should be performed and shown to assure the readers that this is in fact possible. Furthermore, even when based on the particle to pfu ratio, the MOI would still be nearly 2000pfu/cell, so the actual number of observed particles is an order of magnitude lower than what was applied to the cells.

      AU: The fluorescence signal from an individual fluorophore-tagged AdV or anti-hexon antibody-decorated particle is bright enough to be picked up by the PMT or HyD detectors of current confocal laser scanning microscopes. In fact, tracking fluorophore-tagged particles of the size of AdV has been a standard microscopy procedure since the late 1990s.

      Because the Reviewers were questioning the apparently high multiplicity of infection used in the experiments, we clarify the difference between “standard” MOI estimations and our infection set-up. First of all, as described in Material and Methods, we estimated the number of physical virus particles in our virus preparations using A260 measurements (J.A. Sweeney et al., Virol. 2002, doi: 10.1006/viro.2002.1406). This method, like all other methods used to estimate virus particle numbers, is likely not 100% reliable.

      Second, we incubated the virus inoculum with cells only for 60 min, after which the unbound viruses were washed away. During this short incubation time only a small fraction of input virus particles bind to cells, and indeed as shown in Fig.1A, a theoretical MOI of 54400 physical virus particles/cell or 13600 physical virus particles/cell yielded a median of 75 and 26 bound virus particles per cell, respectively. Interpretation of the results from the cell cycle assays required that there was a relatively short time between infection and analysis so that cells did not change their cell cycle status on a large scale during the experiment. This required use of a rather high MOI. Furthermore, for collection of a large data set, it is convenient that every cell is infected.
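
      The gap between theoretical and actual MOI can be made concrete with a short calculation. The sketch below uses only the numbers quoted above for Fig.1A; the function name `bound_fraction` is a hypothetical helper introduced for illustration and is not taken from the manuscript.

```python
# Theoretical MOI (physical particles added per cell) mapped to the median
# number of bound particles per cell, as quoted in the response for Fig.1A.
FIG_1A = {54400: 75, 13600: 26}


def bound_fraction(theoretical_moi, bound_per_cell):
    """Fraction of input particles found bound to a cell (hypothetical helper)."""
    return bound_per_cell / theoretical_moi


for moi, bound in FIG_1A.items():
    print(f"theoretical MOI {moi}: {bound_fraction(moi, bound):.2%} of input bound")
```

      Under these numbers, well below 1% of the input particles end up bound per cell, which is the point the response makes about the short 60 min incubation.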

      Third, what exactly does one pfu mean in terms of physical adenovirus particles? There is no clear answer to this, since several parameters affect the pfu. In which cells was the titration carried out? How long was the input virus inoculum incubated with the cells? How many of the virus particles entering the cell actually established an infection? And, as described in A. Yakimovich et al. (J. Virol. 2012, DOI: 10.1128/JVI.01102-12), only a fraction of infected cells produce a plaque. The majority of papers stating that x pfu/cell was used for infection, usually incubate the cells with the virus inoculum for several hours at 37°C, and never make any attempts to estimate exactly how many virus particles entered into the cells.

      Fig. 4 - I am not certain that the observed difference is significant, at least looking at it, beyond the width difference of the peaks, highest expression for both is largely in G1. It would be nice to see this using a western blot of cell cycle sorted cells, which can easily be accomplished using FACS.

      AU: In the highest GFP expression bin, CMV-eGFP expressing cells have 43% cells in G1 and 50% in S/G2/M. In comparison, E1A-GFP expressing cells have 58% cells in G1 and 35% in S/G2/M. The difference in G1 cells in the highest eGFP bin is statistically significant (p

      Page 15, 2nd paragraph. It would be valuable and informative to determine whether there is heterogeneity in histone association with these different vDNAs and whether these histones exhibit divergent modifications (enabling or restricting transcription). Same as above. I am rather surprised that the DBP signal did not correlate well with vDNA signal, particularly for the larger replication centers. How can this be reconciled? Was there an increase in overall vDNA signal later in infection? It is important to know this as it determines whether the observed vDNA signal is real or could be caused by viral RNA or other background causes (non-infected controls notwithstanding). Can the signal be detected with inactivated viruses (via UV for example?)

      AU: Whether histone modifications impact the transcriptional output of adenovirus genomes early in infection is indeed an intriguing question, but unfortunately this is very challenging, if not impossible, to study at single-cell / single vDNA level with the existing technology. Techniques for single-cell measurements of chromatin states are still in infancy, although some notable advancements in this field were reported in 2019 (e.g. K. Grosselin et al. Nature Genetics, DOI: https://doi.org/10.1038/s41588-019-0424-9 and S. Ai et al. Nature Cell Biology, DOI: https://doi.org/10.1038/s41556-019-0383-5).

      Furthermore, current literature offers a confused picture as to when exactly protein VII on incoming virus genomes is replaced by histones (reviewed in the reference 39, Giberson et al.). Of note, the vast majority of incoming nuclear vDNA molecules scored protein VII-positive with anti-VII staining under the experimental conditions used for the Fig. 2C data. However, we did not include these results into the manuscript because VII-positive signal on vDNAs does not exclude these vDNAs having histones on certain parts of the genome.

      The Reviewer wonders why the DBP signal in Fig.6C does not correlate with vDNA signal. There is no discrepancy here because DBP signal in the figure is a proxy for replicating vDNA whereas the click vDNA signal reports incoming vDNA. The one DBP spot without an associated click vDNA signal could be due to a replication center originated from a replicated viral genome, not from incoming viral genome. The figure shows that incoming vDNAs within the same nucleus initiate replication asynchronously.

      Page 18, 1st paragraph. It would be interesting to determine whether there was association between pol II and those genomes that showed no E1A, similarly to the histone suggestion. What about things like viral chromatin organization? Soriano et al. 2019 showed how E1A and E4orf3 work in tandem to alter viral chromatin organization by varying histone loading on the viral genome.

      AU: This again would be technically very challenging to show. We actually tried to visualize active transcription using an antibody against RNA polymerase II CTD repeat YSPTSPS (phosphor S5), azide-alexa fluor488 and anti-alexa fluor488 antibody to mark EdC-labeled incoming vDNAs and proximity ligation assay for signal amplification. However, this method was not sensitive enough to detect RNA polymerase II association with individual viral genomes. We only detected the proximity ligation signal in replication centers when replicated viral genomes were tagged with EdC.

      Fig. 2. Can you really say that a single dot correlates with a single transcript? Has that been validated in any way?

      AU: Signal amplification with branched DNA technology leads to binding of a large number of fluorescent probes to a mRNA and thus enables detection of single nucleic acid molecules. This has been validated e.g. in A.N. Player et al. 2001. J. Histochem. Cytochem (https://doi.org/10.1177/002215540104900507) and N. Battich et al. 2013. Nature Methods (https://doi.org/10.1038/nmeth.2657).

      **Minor:**

      Page 5, last paragraph. "Transcirpts from the viral late transcription unit,..." This is not correct as recently shown by Crisostomo et al, 2019.

      AU: The data in Crisostomo et al. paper suggest that some late gene expression can occur before vDNA replication, but an abundant accumulation of late transcripts coincides with onset of vDNA replication. However, the Crisostomo et al. study did not test what the levels of late gene transcripts are if the vDNA replication was inhibited. But to acknowledge the possibility that there might be some level of late gene transcription prior to replication of the viral genomes, the sentence is modified as follows: “Transcripts from the viral late transcription unit, amongst them mRNAs for the viral structural proteins, vastly increase in abundance concomitant with the onset of vDNA replication”. Furthermore, we have added the Crisostomo et al. reference here as well.

      Page 10, "... because AdvV-infected cells are less well adherent..." This is not strictly true as loss of attachment only occurs later on in infection. It would be helpful to have statistical significance indicated directly in the figures.

      AU: Although clearly visible cell rounding indeed occurs only late in infection, also during early stages of infection the HAdV-C5-infected cells are less adherent than non-infected cells. In many assays this is not obvious, but the RNA FISH staining procedure includes several incubation and washing steps in rather harsh buffers, and we observed random, sometimes considerable, cell loss with infected cultures but not with non-infected cultures.

      In the revised manuscript we have included the statistical significance P values both into the main text and the figure legends, but not to the figures directly, because the P values were generated with different statistical tests and P values should not be shown/mentioned without stating which statistical test was used. However, we noticed that we had in some cases omitted to mention what was the number of pairs analyzed in some of the Spearman’s correlation tests. This has now been corrected in the revised manuscript.

      The very high MOIs used are concerning, could these have negative effects on the cell viability or overall state?

      AU: We refer to our explanation above about the theoretical MOI and the actual MOI. Furthermore, in the experiment described in Fig.2C (correlation of E1A transcripts per cell vs. viral genomes per cell), 42% of analyzed cells had ≤ 5 viral genomes/cell and 27.5% of analyzed cells had between 6-10 viral genomes per cell; these are not high numbers. We also provide controls that the EdC-labeled genomes are detected with good efficiency. Hence the EdC-labeled genomes per cell are a good estimate of the numbers of virus particles that indeed entered into the cells.

      There are a few typos and such that should be corrected. AU: We have tried to find and correct the typos.

      Reviewer #1 (Significance (Required)):

      As I stated above, the work is interesting and significant, to a degree. The major limitation is that the novelty is low as a paper published in 2017 (cited by the authors) used a very similar approach to investigate a similar problem. In addition, there are multiple other recent papers looking at cell populations in the context of adenovirus infection, and whether a single cell or population based approach is better is unclear. This is something the authors might want to strengthen prior to submission.

      AU: In the current study, we focused on the early phase of HAdV-C5 infection, on how viral gene expression is initiated and how individual nuclear viral genomes proceed to a replicative phase. The Krzywkowski et al. 2017 J. Virol. paper that the reviewer refers to used a padlock probe-based rolling circle amplification technique to simultaneously detect HAdV-C5 genomes and viral mRNAs in individual infected cells.

      The shortcoming of this method is inferior sensitivity compared to the branched DNA technology-based method used by us in the current study. Krzywkowski et al. were able to pick up signals from virus mRNAs and virus genome only relatively late in the infection, i.e. at the time when incoming genomes were expected to have multiplied by replication. Thus the study by Krzywkowski et al. was unable to provide information for the questions addressed in our study, i.e. do the levels of E1A transcripts early in infection correlate with viral vDNA counts in the nucleus and is there variability in the transcription output from individual vDNAs within the same nucleus, or variability in how individual vDNAs within the same nucleus proceed into the replication phase. We hence do provide novel information, and do not consider this as a limitation of our paper.

      We emphasize that population assays are done to attempt to understand molecular basis of a phenomenon by correlations. Instead, deep molecular insights require to-the-point-assays, in the case of transcription, single-molecule live cell assays at the level of single genes. Technically, we (and also the field) are not quite there yet.

      Regardless, our study is a first step towards understanding transcription output of nuclear HAdV-genome at single-cell, single-genome levels. It has revealed insight that was not apparent from population assays. It is clear that the next step will be time-resolved live cell assays with simultaneous detection of transcription output, genome detection and transcription factor clustering on the genomic loci. With current technology the simultaneous detection of all these events is challenging, and requires the development of further technology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors show heterogeneity of AdV-C5 mRNA transcript quantity and dynamics in different cell types, which is regulated by the cell cycle phase and does not correlate to incoming viral DNA, using single molecule RNA FISH technologies and detection of incoming viral DNA by EdC labeling.

      **Major Comments:**

      The authors change the MOI used in their experiments (7 different MOIs are used throughout the paper) in a manner that appears random and without explanation. (54400 for Figure 1A, 1B, 3B, S3B; 37500 for Figure 1C; 23440 for Figure 2A, 2C, S5A; 13600 for Figure 1A, 1D; 36250 for Figure 3C, S3D; 11200 for Figure 4B; 23400 for Figure 6B). The authors should explain why these changes in MOIs are necessary.

      AU: The MOIs given are theoretical MOIs, and essentially all figures indicate what was the actual MOI, that is, the real number of virus particles entering into the cells. This is beyond what is commonly provided in virology. It is essential, however, since MOI differs between different cell types. Therefore, we prefer to use the actual MOI as shown in Fig.1A, or we indicate the number of vDNAs that were delivered to the cells of interest.

      Variable MOIs had to be used to ensure that different cell lines received comparable numbers of virions, in particular virus particle binding to and entering into the cells. Infection kinetics are different in different types of cells, but can be tuned by the MOIs used. Furthermore, different virus preparations were used in the experiments and we performed analyses at different stages of the infection cycle. Due to all these different facets of our experiments, it was impossible to choose one standard (theoretical) MOI for all the experiments.

      The authors use mean fluorescence intensity of E1A probes per cell as estimate for viral transcript abundance for some of their experiments (Figure 1D, E, 3B), and count E1A punctae as measure for E1A transcripts in other experiments (Figure 2C, 3C, 5), without showing data, that these measures correlate. Problematic is hereby, that not all E1A punctae have the same signal intensity, as can be seen in Figure S1, which makes the estimation of the correlation of E1A punctae (= number of transcripts) and fluorescence intensity difficult. The authors should provide both (E1A punctae counts and estimation via fluorescence intensity) for at least one experiment, to prove, that the estimation of E1A transcript levels via fluorescence intensity is feasible.

      AU: The quantification method had to be adjusted to the number of virus transcripts in the cell at the time of analysis. The best quantification method is segmentation and counting the individual fluorescent puncta per cell, but, as stated in the manuscript, this method does not accurately quantify the mRNA puncta from maximum projections of confocal or widefield image stacks when the number of puncta per cell exceeds ~ 200.

      On the other hand, as shown in the quantification below, mean fluorescence intensity measurements per cell do not of course distinguish between cells having one vs. two mRNA puncta. Yet, as shown in the figure below, a relatively good correlation between puncta counting and fluorescence intensity measurements is achieved when cells have ≥ 10 transcripts per cell. Subsets of randomly picked images of the Fig.2C/Fig.5 dataset were included into the analysis (rs is Spearman’s correlation rank coefficient, approximate P

      p.15: "The nuclear E1A signals in AraC-treated cells were resistant to RNase A, but they were dampened by treatment with S1 nuclease (S6B Fig)." The authors make this statement based on (i) two completely different timepoints (12 h.p.i. for RNaseA treatment, 24.5 h.p.i. for S1 nuclease treatment) and (ii) in different clones of the A549 cells as stated in the methods section on p.21 (Two different clones of human lung epithelial carcinoma A549 cells were used in the study: our laboratory's old A549 clone (experiments shown in Fig. 1, Fig. 3B and S1 Fig., S3B and S3C Fig., S6A and S6B Fig., RNase A treatment) and A549 from American Type Culture Collection (ATCC, experiments shown in Fig. 2 and Fig. 5, Fig. 6, S2B Fig., S4 Fig., S5 Fig., and S6B Fig. S1 nuclease-treatment)). This makes it difficult to interpret, if the data is due to differences in the timepoints or cell types, or if it is due to binding of the E1A probe to single stranded vDNA.

      AU: This is a fair criticism, thank you. We have replaced the RNase A figure S6B in the revised manuscript. A new RNase A experiment was repeated in ATCC A549 cells using the same infections conditions as with the S1 nuclease-treated cells.

      **Minor Comments:**

      p.4: "AdV are non-enveloped, double-stranded DNA viruses that cause mild respiratory infections in immuno-competent hosts, and establish persistent infections, which can develop into life-threatening infections if the host becomes immuno-compromised [reviewed in 6]." Not all AdV cause respiratory diseases, the disease outcome of human AdV depends on the site of primary infection, which differs between the different AdV types.

      AU: We have modified the text as follows: AdV are non-enveloped, double-stranded DNA viruses that cause mild respiratory, gastrointestinal or ocular infections…

      p.7: The authors state, that "At the 17 h time point, about half of the cells had high numbers of protein VI transcripts, and most of them very high numbers of E1A transcripts.", however, the picture shown in Figure 1F shows a different phenotype, with low transcript levels of VI in E1A high cells and high transcript levels of VI in E1A low cells.

      AU: This was perhaps a bit difficult to see in the overlay images since one has to distinguish between green and yellowish green. We have provided the individual channels along the overlay picture in Fig. S1D, and now it is clear that at 17h pi cells with high numbers of VI transcripts have also high numbers of E1A transcripts.

      p.8: "This nuclear E1A signal is due to binding of the E1A probe to single-stranded vDNA in the replication centers (see below)." The authors should state here, that due to the binding of the probes to the single stranded vDNA in the replication centers, the nucleus was excluded from the analysis for Figure 1F in late timepoints.

      AU: We have modified the text according to the Reviewer’s suggestion. The text is now as follows: ‘Due to further studies (see below), we assume that this nuclear E1A signal represents binding of the E1A probe to single-stranded vDNA in the replication centers. Accordingly, the nuclear area was excluded when quantifying the viral transcripts per cell in late timepoints (Fig. 1F).’

      Due to this time point the author cannot state that the E1A staining seen (Fig. 1F; indicated with white arrows) are replication centers; this is just an assumption, since there is no evidence in Fig 1 the author cannot be sure; the author should change the text: "taking the following experiments into account...", "due to further studies (see below)..... we assume that..."

      AU: We have modified the text according to the Reviewer’s suggestion; see also the previous comment above.

      p.8: The authors should mention the figure they refer to, since there is no E1B-55K staining in Fig. 1F

      AU: The text has been modified as follows: Whereas other time points showed relatively few E1A, E1B-55K or VI puncta over the nuclear area (Fig. 1B, 1F, S1A Fig.), clustered nuclear E1A signals were apparent at 23 h.

      p.9: Which test was used to calculate the additional p-values?

      AU: As stated in the Material and Methods section or the figure legends, the p-values were calculated either by a permutation test using a custom-programmed R-script (the code has been deposited on Mendeley Data along with other data associated with this manuscript), or by Kolmogorov-Smirnov test using GraphPad Prism. GraphPad Prism was also used to calculate Spearman’s correlation coefficients and the associated approximate p values. In the revised manuscript, we have added the following sentence into the Material and Methods section / Statistical analyses: Spearman’s correlation tests were done using GraphPad Prism.
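
      For readers unfamiliar with permutation tests, the general idea can be sketched as follows. This is a generic illustration, not the deposited R script: the test statistic (absolute difference in means), the function name `permutation_test`, and the parameters `n_perm` and `seed` are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the absolute difference in group means.

    Generic sketch: group labels are shuffled n_perm times and the p-value is
    the share of shuffles whose difference is at least as large as observed.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_perm + 1)
```

      A call such as `permutation_test(group_1_values, group_2_values)` returns an estimated p-value; the two argument names here are placeholders for per-cell measurements from two conditions.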

      p.10: For the experiment for the correlation of viral genomes per cell and E1A transcripts in HDF-TERT cells (Figure S2C), the MOI is missing in the description of the results, as well as in the corresponding figure legends.

      AU: We have indicated the theoretical MOI (~ 4800 virus particles per cell) in the figure legend and in the Material and Methods section. The actual MOI, i.e. the actual number of virus particles entering into the cells, could not be determined due to the long (15 h) incubation time of virus inoculum with the cells, which in turn was required because these cells bind AdV-C5 rather inefficiently. However, between 1 and 32 EdC-labeled virus genomes were detected per cell nucleus at 22 h pi.

      p.11: calculation of correlation? rs? Why does the author combine S and G2/M phase? Fig. S3A shows different values for the phases

      AU: rs is the abbreviation for Spearman’s correlation coefficient, and, as indicated in the Material and Methods, we used GraphPad Prism to calculate the Spearman’s correlation coefficients.

      The two experiments used different methods to estimate cell cycle stages. The DNA content method cannot separate S and G2/M with great confidence, whereas Kusabira Orange-hCdt1 and Azami-Green-hGeminin expression in HeLa-Fucci cells allows a more fine-tuned assessment of the cell cycle phases.

      p.11: "Thus, the total intensity of nuclear DAPI signal can be used to accurately assign G1 vs S/G2/M stage to cells." The authors should also here refer to other papers, which showed that this correlation is feasible, as they did in the methods section (67. Roukos V, Pegoraro G, Voss TC, Misteli T. Cell cycle staging of individual cells by fluorescence microscopy. Nature protocols. 2015;10(2):334-48. Epub 2015/01/31. doi: 10.1038/nprot.2015.016. PubMed PMID: 25633629; PubMed Central PMCID: PMC6318798.), and maybe also refer to a newer paper which deals with this technique: Ferro, A., Mestre, T., Carneiro, P. et al. Blue intensity matters for cell cycle profiling in fluorescence DAPI-stained images. Lab Invest 97, 615-625 (2017). https://doi.org/10.1038/labinvest.2017.13

      AU: The integrated nuclear DAPI signal intensity is indeed a widely used method to assign cell-cycle stage to individual cells. We have added the second reference suggested by the Reviewer to the reference list for this method.

      p.11: "Furthermore, when focusing on the highest E1A expressing cells, i.e. the cells with mean cytoplasmic E1A intensities larger than 1.5 × interquartile range from the 75th percentile, 71.9% of these cells were found to be in the G1 phase of cell cycle, whereas only 55.8% of cells in the total sampled cell population were G1 cells." The authors do not provide any reference to a figure within the manuscript or the supplements, which contains these data. Are these data not shown in the manuscript?

      AU: These values are calculated from the data shown in Fig.3B. The source data supporting findings of this study (maximum projection images, excel files of the CellProfiler and Knime workflows) have now been deposited to Mendeley Data as stated in the Material and Methods / Data availability section of the revised manuscript and listed in Supplementary tables.
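
      The selection rule quoted by the Reviewer (mean cytoplasmic E1A intensity more than 1.5 × interquartile range above the 75th percentile) can be sketched in a few lines. This is a minimal illustration only: the function names, the `(intensity, phase)` tuple layout, and the quartile method (Python's default in `statistics.quantiles`) are assumptions and may differ from the CellProfiler/Knime workflows deposited with the manuscript.

```python
from statistics import quantiles


def high_expressor_threshold(intensities):
    """Return Q3 + 1.5 * IQR for a list of per-cell intensities."""
    q1, _median, q3 = quantiles(intensities, n=4)
    return q3 + 1.5 * (q3 - q1)


def g1_fraction(cells, threshold=None):
    """Fraction of G1 cells among cells with intensity above `threshold`.

    `cells` is a list of (intensity, phase) tuples, where phase is "G1" or
    "S/G2/M" (hypothetical layout). With threshold=None all cells are used.
    """
    selected = [phase for value, phase in cells
                if threshold is None or value > threshold]
    if not selected:
        return float("nan")
    return sum(phase == "G1" for phase in selected) / len(selected)
```

      Comparing `g1_fraction(cells, high_expressor_threshold([v for v, _ in cells]))` with `g1_fraction(cells)` gives the two kinds of percentages discussed above (G1 share among the highest expressors vs. G1 share in the whole sampled population).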

      p.12: punctuation mistake; . instead of , To enrich G1 cells. AdV-C-5 (moi ~ 36250) was added. Why does the author switch between signal intensities and counting E1A puncta per cell (limited to 200) in the different experiments to illustrate accumulation of E1A transcripts?

      AU: The same answer as above: the quantification method had to be adjusted to the number of virus transcripts in the cell at the time of analysis. The best quantification method is segmentation and counting the individual fluorescent puncta per cell, but, as stated in the manuscript, this method does not accurately quantify the mRNA puncta from maximum projections of confocal or widefield image stacks when the number of puncta per cell exceeds ~ 200. On the other hand, as shown in the quantification in the new S1C Fig., mean fluorescence intensity measurements per cell do not of course distinguish between cells having one vs. two mRNA puncta, but a relatively good correlation between puncta counting and fluorescence intensity measurements is achieved when cells have ≥ 10 transcripts per cell.

      p.14: "For E1A (or E1B-55K), we did not detect transcriptional bursts with bDNA-FISH probes on nuclear vDNAs, either prior to or after accumulation of viral transcripts in the cell cytoplasm." The authors do not provide any reference to a figure within the manuscript or the supplements, which contains these data. Are these data not shown in the manuscript?

      AU: This statement is based on hundreds of images we have analyzed during the course of the study. It is impossible to show all of these images, so in principle, this is “data not shown”. We have modified the text as follows: With hundreds of images analyzed, we never unambiguously detected transcriptional bursts with E1A (or E1B-55K) bDNA-FISH probes on nuclear vDNAs, either prior to or after accumulation of viral transcripts in the cell cytoplasm.

      p.14: space between number and %

      AU: Thank you for pointing this out. It has been corrected.

      p.15: "This is was also seen in AdV-C5-EdC-infected cells" should be changed to "This was also seen in AdV-C5-EdC-infected cells"

      AU: Thank you for pointing this out. It has been corrected.

      Fig. 1B:

      −figure legend does not indicate how cells were stained −also no description in the continuous text −which E1A transcripts are stained? all? 12S? 13S?

      AU: The first sentence in Results section states that “We used fluorescent in situ hybridization (FISH) with probes targeting E1A, E1B-55K and protein VI transcripts followed by branched DNA (bDNA) signal amplification to visualize the appearance and abundance of viral transcripts in AdV-C5-infected A549 lung carcinoma cells.” Furthermore, the legend to Figure 1 starts with the title “Visualization of AdV-C5 E1A, E1B-55K and protein VI transcripts in infected cells by bDNA-FISH technique”, and the legend to Fig.1B mentions that “cells were stained with probes against E1A and E1B-55K mRNAs or E1A and protein VI mRNAs”. We are of the opinion that this is enough information to understand the figures.

      The main text to Fig.1 also states that “The E1A probes covered the entire E1A primary transcript region and thus all E1A splice variants. The temporal control of E1A primary transcript splicing and E1A mRNA stability give rise predominantly to 13S and 12S E1A mRNAs at 5 h pi (references)”.

      Fig. 1D: −difference in accumulation of viral transcripts is not that visible as in IF staining (Fig. 1B; Fig. 1S);

      AU: Neither Fig. 1 nor S1 Fig. shows IF staining; both show signals from FISH.

      −graph does not show any difference between E1A and E1B-55K

      AU: The y-axes values in Fig.1D graph are arbitrary units and thus E1A and E1B-55K graphs are not directly comparable to each other. We have included into the revised manuscript S1B Fig., which shows quantification of E1A and E1B-55K fluorescent puncta per cell at the 5 h pi; the difference between E1A and E1B-55K was statistically significant.

      Fig. 1F: −figure legend does not fit with labelling of IF images and continuous text −description says 22 h, while IF labeling and text (p. 7, last lane) mentions 23 h pi

      AU: The figure annotations state the time of analysis as total time after virus addition to cells, whereas the text stated the time of analysis as x h post virus removal, since we wanted to stress that the input virus was incubated with the cells for only 1 h. However, the Reviewers found this confusing, so we have changed the text in the revised manuscript so that the time of analysis is stated as total time after virus addition to cells (as in the figure annotations). Only in the Material and Methods section do we maintain the original 1 h + x h statement for the time of analysis.

      Fig. 2A: −figure legend: lane 5 Punctuation wrong: azide-Alexa Fluor488. Alexa Fluor647

      AU: Thank you for pointing this out. It has been corrected.

      Fig. 4A: −difficulties to understand −author stated that promoter-driven EGFP expression is clearly dominated by G1 cells for E1A and by S/G2/M cells for CMV, however this is not clearly visible in the graph −no severe differences visible between CMV-eGFP and E1A-eGFP −author should include numbers for quantification and statistical calculations to illustrate the differences

      AU: In the highest GFP expression bin, CMV-eGFP expressing cells have 43% cells in G1 and 50% in S/G2/M (n=2149). In comparison, E1A-GFP expressing cells have 58% cells in G1 and 35% in S/G2/M (n=2258). The difference in G1 cells in the highest eGFP bin is statistically significant (p

      Fig. 4B: −amount of E1A protein levels calculated via IF (signal intensities) −immunofluorescence is not a suitable tool for protein quantification

      AU: It is true that not all antibodies are suitable for IF (or for Western blot), and we cannot be certain that the monoclonal anti-E1A antibody used by us detects all E1A forms with different post-translational modifications with equal efficiency. However, IF is a widely accepted method to estimate protein levels in the cell, especially if the proteins like E1A accumulate in the nucleus (makes segmentation of the signal easy) and give a rather uniform nuclear staining pattern.

      Fig. 5: −in A. it is stated, that E1A bDNA -FISH is not suitable, since it is too short to be detectable. However, in B E1A bDNA-FISH is used. is there a difference? −according to the method part just one E1A mRNA was used for the assays, why is it then not possible to use that one in Fig. 5A? −explanation of the procedure and the experiment is very confusing

      AU: The Reviewer probably refers to Fig.6 here, not to Fig.5. The E1A introns are short (about 100 bases) and cannot be picked up with bDNA FISH probes. In Fig. 6B we were using the E1A bDNA-FISH probes, which were made against the AdV-C5 genome map positions 551-1630 to detect vDNA single strands of the E1A region and these single strands were long enough to be picked out by our E1A probes.

      Fig. S6B: −authors want to show that it is RNase-insensitive, but S1 nuclease-sensitive

      −two different A549 cell clones and two different time points are used for the treatments → not comparable to each other

      AU: This is a fair criticism. We have replaced the RNase A figure in S6B Fig. in the revised manuscript. The new RNase A experiment was carried out in ATCC A549 cells using the same infections conditions as with the S1 nuclease-treated cells.

      Material and Methods: −headings do not indicate which methods are explained −no clear structure

      AU: We have made minor changes to the headings of Material and Methods section. We have first explained in detail the bDNA-FISH method, but otherwise the order is according to the order of the figures.

      Reviewer #2 (Significance (Required)):

      highly significant manuscript very important for the virology field

      my research topics are human adenoviruses and their replication cycle

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:** Suomalainen et al have studied adenovirus viral gene expression and replication at a single-cell level. They explore the extent of correlation between incoming genome copy number and early gene expression and progression into the late phase, revealing substantial variation between cells in the numbers of E1A transcripts (the first gene expressed upon infection) that is not explained by differences in the numbers of viral genome templates in the cells. They also explore the relevance of cell cycle stage to this variability and show a positive correlation between G1 cell cycle stage and higher levels of gene activity, which explains at least part of the variation. To form these conclusions they have applied new methods to visualise and quantify single molecules of nucleic acid in single cells. The experiments are all carefully and fully described with full detail of materials. Overall the manuscript is well written and easy to follow.

      **Major comments:**

      All of the experiments appear to be done with rigour and their results reported with due regard to statistical significance etc. My major concern though is that they have been done, perhaps out of necessity to get detectable signals, at very high multiplicities of infection. A well-accepted standard to achieve infection of all cells in a culture is an MOI of 10 infectious units per cell. Even this is acknowledged not to represent the biology of natural infection and it is striking that, where technically feasible, lower MOI studies are more revealing of how a virus actually works. Here, the authors have used counts of particles rather than infectious units to determine MOI and for Ad5, the particle/pfu ratio is typically 20-100. Their MOIs though are 13,000 - 50,000 per cell, implying an infectious MOI of at least 130 for their A549 experiments, which are known to be readily infected by Ad5 from other work.

      AU: Unlike common experiments done by others, we used a synchronized infection and removed the input virus after 1h incubation at 37°C. This type of infection initiation requires high input virus amounts, as opposed to studies in which the virus inoculum is incubated with cells for several hours/days, as is typically done in studies determining the infectious or plaque forming units in virus inoculum. Hence, the MOI used by others involved incubation of inoculum with cells over extended periods of time, and they cannot be compared to our pulsed infection conditions.

      Although the calculated theoretical MOIs (physical particles/cell) were high in our experiments, only 0.1% – 0.2% of input virus particles bound to cells during the 1h incubation period (Fig. 1 A; this estimation is based on the ratios between Median values for the number of cell-associated viruses vs input virus numbers).
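The arithmetic behind this exchange can be sketched as follows. This is only an illustration using numbers quoted in the reviews and the reply: the particle/pfu ratio of 20-100 and the particle MOIs of 13,000-50,000 come from Reviewer #3, and the median of 75 cell-associated particles is paired here with the input of 54,400 particles per cell listed for Fig. 1A by Reviewer #2 (that pairing is an assumption). The function and variable names are hypothetical.

```python
# Illustrative arithmetic only; all names are hypothetical.
PARTICLE_TO_PFU_RANGE = (20, 100)       # typical Ad5 particle/pfu ratio (Reviewer #3)
PARTICLE_MOI_RANGE = (13_000, 50_000)   # particles per cell (Reviewer #3)


def infectious_moi_bounds(particle_moi_range, ratio_range):
    """Convert a particle-based MOI range into an infectious-unit MOI range."""
    lowest = particle_moi_range[0] / ratio_range[1]   # fewest particles, worst ratio
    highest = particle_moi_range[1] / ratio_range[0]  # most particles, best ratio
    return lowest, highest


def bound_fraction(median_bound_per_cell, input_particles_per_cell):
    """Fraction of input particles that end up cell-associated."""
    return median_bound_per_cell / input_particles_per_cell


low, high = infectious_moi_bounds(PARTICLE_MOI_RANGE, PARTICLE_TO_PFU_RANGE)
# Assumed pairing: median of 75 bound particles at an input of 54,400 per cell.
fraction = bound_fraction(75, 54_400)

print(f"infectious MOI between {low:.0f} and {high:.0f}")
print(f"bound fraction: {fraction:.2%}")
```

Under these assumptions the lower bound matches the Reviewer's "at least 130", and the bound fraction falls inside the 0.1% – 0.2% range stated in the reply.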

      Furthermore, in the experiment described in Fig.2C (correlation of E1A transcripts per cell vs. viral genomes per cell), 42% of analyzed cells had ≤ 5 viral genomes/cell and 27.5% of analyzed cells had between 6-10 viral genomes per cell. Please note that these are not high numbers.

      The input virus amounts used were selected this way, because we aimed at getting a broader view of how virus transcription at early phases of infection responds to a varying number of virus genomes delivered to the nucleus. Therefore, we did not limit the analyses to a situation with 1 or less than 1 virus particles/genomes per cell.

      In addition, the analyses of how cell cycle phase impacts the initiation of virus gene expression requires a relatively short time between virus inoculation and time point of analysis (i.e. a rather high MOI). Otherwise, as also pointed out by the Reviewer, the cells could have experienced more than one cell cycle phase during the duration of the experiment. Furthermore, although the initial natural infection probably starts with a very low MOI, the second round of infection is a high MOI infection due to a large number of progeny virus particles released from an infected cell.

      Surprisingly, the authors do not see intracellular vDNA copy numbers that are fully reflective of this high MOI, with median intracellular vDNA of 75 /cell at the highest MOI. The authors should consider how the population distribution of vDNA /cell does or does not fit the predicted Poisson distribution. Nonetheless, at these high copy numbers / cell, there must surely be a risk that the variation in gene expression activity arises stochastically, out of competition between genomes for essential transcription factors. Given that multiple cellular factors are each required for E1A transcription, high genome copy numbers could actually inhibit E1A expression relative to cells with more modest copy numbers because limited supplies of individual factors are recruited to different viral genome copies.

      AU: The “discrepancy” between theoretical MOI and the actual observed number of cell-associated virus particles or cell-associated virus genomes is explained above. Furthermore, we would like to point out that we have directly estimated the number of virus particles bound to cells with the input virus amounts used, something that is usually not done in other studies.

      It is indeed theoretically possible that high nuclear genome numbers could lead to inhibition of transcription due to competition for limiting essential host factors. However, if we included only cells with ≤4 vDNA molecules per nucleus into the analysis (total number of cells analyzed was 258), then Spearman’s correlation coefficient for vDNA per nucleus vs E1A mRNAs per cell was 0.186 (p=0.0027). Thus, this would not support the notion that cells with moderate nuclear vDNA copy numbers would have a better correlation between the nuclear vDNA copies vs E1A mRNA counts per cell.
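For readers unfamiliar with the statistic quoted above, Spearman's correlation coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following is a minimal standard-library sketch; it is not the authors' analysis code, and the per-cell values are made up purely for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example values (NOT data from the manuscript).
vdna_per_nucleus = [1, 2, 2, 3, 4, 4]
e1a_mrna_per_cell = [5, 40, 3, 12, 8, 60]
print(round(spearman(vdna_per_nucleus, e1a_mrna_per_cell), 3))
```

A coefficient near 0 indicates little monotonic relationship and a value near 1 a strong one, which is why 0.186 is read here as a weak correlation.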

      The vDNA/cell in Fig.2C does not fit predicted Poisson distribution, var/mean=9.129.
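The var/mean figure above is the index of dispersion, which is expected to be close to 1 for Poisson-distributed counts, so a value near 9 indicates strong overdispersion. A minimal sketch of the calculation, using invented counts rather than the Fig. 2C data and assuming the sample variance is used, could look like this:

```python
import statistics


def index_of_dispersion(counts):
    """Variance-to-mean ratio; approximately 1 for Poisson-distributed counts."""
    return statistics.variance(counts) / statistics.mean(counts)


# Invented vDNA-per-cell counts for illustration only (NOT the Fig. 2C data).
example_counts = [0, 1, 1, 2, 3, 5, 8, 12, 20, 35]
ratio = index_of_dispersion(example_counts)
print(f"var/mean = {ratio:.2f}")
```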

      It is important for the analysis of correlation of gene expression with cell cycle that the virus has not, at the time point analysed, already perturbed the cell cycle (a well-known effect of infection) which the authors document in Suppl Fig3B. To my eye, the G1 peak in infected cells is somewhat narrower than in the control while the S/G2 bump is a little greater. The % of cells in each of the two gates needs to be shown to support the conclusion.

      AU: In non-infected sample G1= 54.63% and S/G2/M = 45.37%, in infected cells G1= 51.4% and S/G2/M= 48.6%. We have added this information into the S3B Fig.

      Turning to the experiments documenting a correlation between E1A expression and cell cycle stage, the authors interpret their findings in terms of the stage the cells are at when the analysis was done (G1 stage cells have more E1A transcripts). The key experiment (Fig 3B) is analysed at only 4 h pi, so substantial progression from G2/M back to G1 after virus addition can probably be discounted, but the point should be discussed. The authors also use release from G1 in another cell line to support their argument that G1 supports higher levels of E1A expression (Fig 3C). Here, they elect to exclude all cells with fewer than 50 E1A transcripts from their analysis. The reason for this is completely obscure and isn't obviously justified; conceivably it could bias the outcome of the experiment. At minimum, this decision needs to be carefully explained; ideally, the full data set should be used.

      AU: Fig.3B: As suggested by the Reviewer, we have added to the main text the following explanation: “We used a high MOI infection (median 75 cell-associated virus particles, Fig. 1A) in order to achieve a rapid onset of E1A expression so that the time between virus addition and analysis was short. Thus, it is not expected that a substantial number of cells would have changed their cell cycle status during the experiment.”

      Fig.3C: We also show the results from the full data set of infected cells, i.e., cells with ≥ 1 E1A puncta, in S3D Fig. We excluded the cells with zero E1A puncta because with these cells it is impossible to know whether they received no virus or whether E1A transcription had not yet started. A permutation test indicated that the difference between the starved+starved and starved+FCS samples is statistically significant even in this case. Because both samples are dominated by cells with low E1A counts, we log-transformed the E1A values for the box plot figure.

      The authors note the highest level of E1A activity (as opposed to RNA) was in G1/S cells and suggest that high E1A cells advance preferentially into S. Whilst in line with the literature that E1A promotes progression into S, an alternative explanation is simply that there is a time lag between RNA accumulation and protein accumulation, during which progression through the cycle would be expected.

      AU: This is a valid point, and we have modified the text as follows: “… which could reflect the advancement of high E1A expressing cells into S-phase. However, considering the time between virus addition and analysis (10.5 h), we cannot exclude the possibility that the observed G1/S preference is at least partly due to time-dependent progression of G1 cells to G1/S.”

      **Minor comments:** Fig 1 and elsewhere. Given that the 1 h incubations with virus were done at 37 C, the convention would be to include this period in the time post-infection at which harvest / fix time points are quoted. There is inconsistency between text and legend with 12 h pi being sometimes represented as 11 h after virus removal; this is an unnecessary confusion.

      AU: We have modified the text so that hours pi always include the 1 h incubation with the input virus. Only in the Material and Methods section did we keep the original notation (1 h virus binding, fixing at x h post virus removal).

      Results description prior to the ref to Fig 1B: unclear what this is supposed to mean.

      AU: We have now slightly modified the first paragraph of the Results section. We mention the benefits of the bDNA signal amplification method and explain the experimental set up, i.e. that the input virus was incubated with the cells only for 1h. We also justify why we used a short incubation for the virus inoculum.

      Fig 4A: provide % of cells in each gate in each histogram.

      AU: In the highest GFP expression bin, CMV-eGFP expressing cells have 43% of cells in G1 and 50% in S/G2/M. In comparison, E1A-GFP expressing cells have 58% of cells in G1 and 35% in S/G2/M. This has been added to the figure, and it is also mentioned in the main text. Furthermore, we added to the text the results from Two Proportion Z-test to show that the proportion difference of G1 cells in the highest bin was statistically significant (p
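A two-proportion z-test of the kind named above can be sketched as follows. This is a generic pooled-variance version, not the authors' code. The cell counts are reconstructed from the rounded percentages together with the sample sizes given in the reply to Reviewer #2 (n=2149 for CMV-eGFP, n=2258 for E1A-GFP), so the result is approximate and is not meant to reproduce the reported p-value.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumption: G1 counts rebuilt from rounded percentages, so only approximate.
n_e1a, n_cmv = 2258, 2149
g1_e1a = round(0.58 * n_e1a)
g1_cmv = round(0.43 * n_cmv)

z, p_value = two_proportion_z_test(g1_e1a, n_e1a, g1_cmv, n_cmv)
print(f"z = {z:.2f}, two-sided p = {p_value:.3g}")
```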

      Fig 5: bottom right panel x axis label is wrong

      AU: Thank you for pointing this out. This has been corrected.

      In the presentation of Fig 6, it would be much clearer for the reader if the detected replication foci (ss DNA detected as E1A puncta) were referred to as something other than E1A puncta. There is too much scope for confusion with the earlier experiments in which E1A RNA was detected.

      AU: We agree. In the revised manuscript, we refer to these puncta in the text as E1A ssDNA-foci.

      Reviewer #3 (Significance (Required)):

      The study represents the application of state of the art single-molecule visualization techniques to an as yet not understood aspect of virus infection. That said, there is prior experimentation in this area, which the authors fully acknowledge and build upon. The new work is largely descriptive, in that it reveals very clearly the discrepancy between genome copy number and amounts of mRNA without seeking to explain these, beyond the cell cycle analysis. Whilst there is a better correlation between vDNA number and transcript once the data are stratified by cell cycle stage, it is still not strong (Fig 5), indicating that other substantial contributing factors remain to be described.

      The work will be of interest certainly to adenovirologists, but also to others who study virus infections - particularly nuclear-replicating DNA viruses such as herpesviruses - where similar considerations are likely to apply.

      Expertise: adenovirus; gene expression; virus-host interactions; molecular biology

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary: Suomalainen et al have studied adenovirus viral gene expression and replication at a single-cell level. They explore the extent of correlation between incoming genome copy number and early gene expression and progression into the late phase, revealing substantial variation between cells in the numbers of E1A transcripts (the first gene expressed upon infection) that is not explained by differences in the numbers of viral genome templates in the cells. They also explore the relevance of cell cycle stage to this variability and show a positive correlation between G1 cell cycle stage and higher levels of gene activity, which explains at least part of the variation. To form these conclusions they have applied new methods to visualise and quantify single molecules of nucleic acid in single cells. The experiments are all carefully and fully described with full detail of materials. Overall the manuscript is well written and easy to follow.

      Major comments:

      All of the experiments appear to be done with rigour and their results reported with due regard to statistical significance etc. My major concern though is that they have been done, perhaps out of necessity to get detectable signals, at very high multiplicities of infection. A well-accepted standard to achieve infection of all cells in a culture is an MOI of 10 infectious units per cell. Even this is acknowledged not to represent the biology of natural infection and it is striking that, where technically feasible, lower MOI studies are more revealing of how a virus actually works. Here, the authors have used counts of particles rather than infectious units to determine MOI and for Ad5, the particle/pfu ratio is typically 20-100. Their MOIs though are 13,000 - 50,000 per cell, implying an infectious MOI of at least 130 for their A549 experiments, which are known to be readily infected by Ad5 from other work.

      Surprisingly, the authors do not see intracellular vDNA copy numbers that are fully reflective of this high MOI, with median intracellular vDNA of 75 /cell at the highest MOI. The authors should consider how the population distribution of vDNA /cell does or does not fit the predicted Poisson distribution. Nonetheless, at these high copy numbers / cell, there must surely be a risk that the variation in gene expression activity arises stochastically, out of competition between genomes for essential transcription factors. Given that multiple cellular factors are each required for E1A transcription, high genome copy numbers could actually inhibit E1A expression relative to cells with more modest copy numbers because limited supplies of individual factors are recruited to different viral genome copies. It is important for the analysis of correlation of gene expression with cell cycle that the virus has not, at the time point analysed, already perturbed the cell cycle (a well-known effect of infection) which the authors document in Suppl Fig3B. To my eye, the G1 peak in infected cells is somewhat narrower than in the control while the S/G2 bump is a little greater. The % of cells in each of the two gates needs to be shown to support the conclusion.

      Turning to the experiments documenting a correlation between E1A expression and cell cycle stage, the authors interpret their findings in terms of the stage the cells are at when the analysis was done (G1 stage cells have more E1A transcripts). The key experiment (Fig 3B) is analysed at only 4 h pi, so substantial progression from G2/M back to G1 after virus addition can probably be discounted, but the point should be discussed. The authors also use release from G1 in another cell line to support their argument that G1 supports higher levels of E1A expression (Fig 3C). Here, they elect to exclude all cells with fewer than 50 E1A transcripts from their analysis. The reason for this is completely obscure and isn't obviously justified; conceivably it could bias the outcome of the experiment. At minimum, this decision needs to be carefully explained; ideally, the full data set should be used.

      The authors note the highest level of E1A activity (as opposed to RNA) was in G1/S cells and suggest that high E1A cells advance preferentially into S. Whilst in line with the literature that E1A promotes progression into S, an alternative explanation is simply that there is a time lag between RNA accumulation and protein accumulation, during which progression through the cycle would be expected.

      Minor comments:

      Fig 1 and elsewhere. Given that the 1 h incubations with virus were done at 37 C, the convention would be to include this period in the time post-infection at which harvest / fix time points are quoted. There is inconsistency between text and legend with 12 h pi being sometimes represented as 11 h after virus removal; this is an unnecessary confusion.

      Results description prior to the ref to Fig 1B: unclear what this is supposed to mean.

      Fig 4A: provide % of cells in each gate in each histogram.

      Fig 5: bottom right panel x axis label is wrong

      In the presentation of Fig 6, it would be much clearer for the reader if the detected replication foci (ss DNA detected as E1A puncta) were referred to as something other than E1A puncta. There is too much scope for confusion with the earlier experiments in which E1A RNA was detected.

      Significance

      The study represents the application of state of the art single-molecule visualization techniques to an as yet not understood aspect of virus infection. That said, there is prior experimentation in this area, which the authors fully acknowledge and build upon. The new work is largely descriptive, in that it reveals very clearly the discrepancy between genome copy number and amounts of mRNA without seeking to explain these, beyond the cell cycle analysis. Whilst there is a better correlation between vDNA number and transcript once the data are stratified by cell cycle stage, it is still not strong (Fig 5), indicating that other substantial contributing factors remain to be described.

      The work will be of interest certainly to adenovirologists, but also to others who study virus infections - particularly nuclear-replicating DNA viruses such as herpesviruses - where similar considerations are likely to apply.

      Expertise: adenovirus; gene expression; virus-host interactions; molecular biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors show heterogeneity of AdV-C5 mRNA transcript quantity and dynamics in different cell types, which is regulated by the cell cycle phase and does not correlate to incoming viral DNA, using single molecule RNA FISH technologies and detection of incoming viral DNA by EdC labeling.

      Major Comments:

      The authors change the MOI used in their experiments (7 different MOIs are used throughout the paper) in a manner that appears random and without explanation (54400 for Figure 1A, 1B, 3B, S3B; 37500 for Figure 1C; 23440 for Figure 2A, 2C, S5A; 13600 for Figure 1A, 1D; 36250 for Figure 3C, S3D; 11200 for Figure 4B; 23400 for Figure 6B). The authors should explain why these changes in MOI are necessary.

      The authors use mean fluorescence intensity of E1A probes per cell as an estimate of viral transcript abundance for some of their experiments (Figure 1D, E, 3B), and count E1A punctae as a measure of E1A transcripts in other experiments (Figure 2C, 3C, 5), without showing data that these measures correlate. The problem here is that not all E1A punctae have the same signal intensity, as can be seen in Figure S1, which makes it difficult to estimate the correlation between E1A punctae (= number of transcripts) and fluorescence intensity. The authors should provide both (E1A punctae counts and estimation via fluorescence intensity) for at least one experiment, to prove that the estimation of E1A transcript levels via fluorescence intensity is feasible.

      p.15: "The nuclear E1A signals in AraC-treated cells were resistant to RNase A, but they were dampened by treatment with S1 nuclease (S6B Fig)." The authors make this statement based on (i) two completely different timepoints (12 h.p.i. for RNaseA treatment, 24.5 h.p.i. for S1 nuclease treatment) and (ii) different clones of the A549 cells, as stated in the methods section on p.21 (Two different clones of human lung epithelial carcinoma A549 cells were used in the study: our laboratory's old A549 clone (experiments shown in Fig. 1, Fig. 3B and S1 Fig., S3B and S3C Fig., S6A and S6B Fig., RNase A treatment) and A549 from American Type Culture Collection (ATCC, experiments shown in Fig. 2 and Fig. 5, Fig. 6, S2B Fig., S4 Fig., S5 Fig., and S6B Fig. S1 nuclease-treatment)). This makes it difficult to tell whether the data are due to differences in the timepoints or cell types, or to binding of the E1A probe to single stranded vDNA.

      Minor Comments:

      p.4: "AdV are non-enveloped, double-stranded DNA viruses that cause mild respiratory infections in immuno-competent hosts, and establish persistent infections, which can develop into life-threatening infections if the host becomes immuno-compromised [reviewed in 6]." Not all AdV cause respiratory diseases, the disease outcome of human AdV depends on the site of primary infection, which differs between the different AdV types.

      p.7: The authors state, that "At the 17 h time point, about half of the cells had high numbers of protein VI transcripts, and most of them very high numbers of E1A transcripts.", however, the picture shown in Figure 1F shows a different phenotype, with low transcript levels of VI in E1A high cells and high transcript levels of VI in E1A low cells.

      p.8: "This nuclear E1A signal is due to binding of the E1A probe to single-stranded vDNA in the replication centers (see below)." The authors should state here, that due to the binding of the probes to the single stranded vDNA in the replication centers, the nucleus was excluded from the analysis for Figure 1F in late timepoints. Due to this time point the author cannot state that the E1A staining seen (Fig. 1F; indicated with white arrows) are replication centers; this is just an assumption, since there is no evidence in Fig 1 the author cannot be sure; the author should change the text: "taking the following experiments into account...", "due to further studies (see below)..... we assume that..."

      p.8: The authors should mention the figure they refer to, since there is no E1B-55K staining in Fig. 1F

      p.9: Which test was used to calculate the additional p-values?

      p.10: For the experiment for the correlation of viral genomes per cell and E1A transcripts in HDF-TERT cells (Figure S2C), the MOI is missing in the description of the results, as well as in the corresponding figure legends.

      p. 11: calculation of correlation? rs? Why does the author combine S and G2/M phase? Fig. S3A show different values for the phases

      p.11: "Thus, the total intensity of nuclear DAPI signal can be used to accurately assign G1 vs S/G2/M stage to cells." The authors should also here refer to other papers, which showed that this correlation is feasible, as they did in the methods section (67. Roukos V, Pegoraro G, Voss TC, Misteli T. Cell cycle staging of individual cells by fluorescence microscopy. Nature protocols. 2015;10(2):334-48. Epub 2015/01/31. doi: 10.1038/nprot.2015.016. PubMed PMID: 25633629; PubMed Central PMCID:PMCPMC6318798.), and maybe also refer to a newer paper which deals with this technique: Ferro, A., Mestre, T., Carneiro, P. et al. Blue intensity matters for cell cycle profiling in fluorescence DAPI-stained images. Lab Invest 97, 615-625 (2017). https://doi.org/10.1038/labinvest.2017.13

      p.11: "Furthermore, when focusing on the highest E1A expressing cells, i.e. the cells with mean cytoplasmic E1A intensities larger than 1.5 × interquartile range from the 75th percentile, 71.9% of these cells were found to be in the G1 phase of cell cycle, whereas only 55.8% of cells in the total sampled cell population were G1 cells." The authors do not provide any reference to a figure within the manuscript or the supplements, which contains these data. Are these data not shown in the manuscript?
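The cutoff quoted in this comment (values more than 1.5 × interquartile range above the 75th percentile) is the usual upper outlier fence. A minimal sketch of that selection rule is given below. The intensity values are invented, and the quartile method (Python's default "exclusive" method) is an assumption, since the manuscript's exact quartile convention is not stated here.

```python
import statistics


def upper_fence(values, k=1.5):
    """Return Q3 + k * IQR, the upper outlier fence."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 + k * (q3 - q1)


def high_expressers(values, k=1.5):
    """Values lying above the upper fence."""
    fence = upper_fence(values, k)
    return [v for v in values if v > fence]


# Invented mean cytoplasmic E1A intensities (arbitrary units), for illustration.
intensities = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(upper_fence(intensities), high_expressers(intensities))
```

The fraction of G1 cells among the selected high expressers can then be compared with the G1 fraction in the whole sampled population, as the quoted sentence does.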

      p.12: punctuation mistake; . instead of , To enrich G1 cells. AdV-C-5 (moi ~ 36250) was added.

      Why does the author switch between signal intensities and counting E1A puncta per cell (limited to 200) in the different experiments to illustrate accumulation of E1A transcripts?

      p.14: "For E1A (or E1B-55K), we did not detect transcriptional bursts with bDNA-FISH probes on nuclear vDNAs, either prior to or after accumulation of viral transcripts in the cell cytoplasm." The authors do not provide any reference to a figure within the manuscript or the supplements, which contains these data. Are these data not shown in the manuscript?

      p.14: space between number and %

      p.15: "This is was also seen in AdV-C5-EdC-infected cells" should be changed to "This was also seen in AdV-C5-EdC-infected cells"

      Fig. 1B:

      −figure legend does not indicate how cells were stained

      −also no description in the continuous text

      −which E1A transcripts are stained? all? 12S? 13S?

      Fig. 1D:

      −difference in accumulation of viral transcripts is not that visible as in IF staining (Fig. 1B; Fig. 1S);

      −graph does not show any difference between E1A and E1B-55K

      Fig. 1F:

      −figure legend does not fit with labelling of IF images and continuous text

      −description says 22 h, while IF labeling and text (p. 7, last lane) mentions 23 h pi

      Fig. 2A:

      −figure legend: lane 5 Punctuation wrong: azide-Alexa Fluor488. Alexa Fluor647

      Fig. 4A:

      −difficulties to understand

      −author stated that promoter-driven EGFP expression is clearly dominated by G1 cells for E1A and by S/G2/M cells for CMV, however this is not clearly visible in the graph

      −no severe differences visible between CMV-eGFP and E1A-eGFP

      −author should include numbers for quantification and statistical calculations to illustrate the differences

      Fig. 4B:

      −amount of E1A protein levels calculated via IF (signal intensities)

      −immunofluorescence is not a suitable tool for protein quantification

      Fig. 5:

      −in A it is stated that E1A bDNA-FISH is not suitable, since it is too short to be detectable. However, in B E1A bDNA-FISH is used. Is there a difference?

      −according to the method part just one E1A mRNA was used for the assays, why is it then not possible to use that one in Fig. 5A?

      −explanation of the procedure and the experiment is very confusing

      Fig. S6B:

      −authors want to show that it is RNase-insensitive, but S1 nuclease-sensitive

      −two different A549 cell clones and two different time points are used for the treatments → not comparable to each other

      Material and Methods:

      −headings do not indicate which methods are explained

      −no clear structure

      Significance

      highly significant manuscript very important for the virology field

      my research topics are human adenoviruses and their replication cycle

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Suomalainen et al. describes a fluorescence-based approach combined with high-resolution confocal microscopy to study the heterogeneity of adenovirus infection in a population of human cells. The main focus of the authors is the detection of viral transcripts in infected cells, how this correlates with viral genomes, the cell state, and how it varies between different cells in a single population. The paper is generally well written and easy to read, with a few typos, although I found parts of it to be somewhat lengthy and repetitive. Particularly the results section could be pruned somewhat for readability and clarity. The major limitation of the study as it stands is its overall impact and novelty, which limits journal selection somewhat. A very similar study was recently published, which the authors cite (Krzywkowski et al, 2017). Nevertheless, I think the study design is rigorous and well executed, but I do have some specific comments which may enhance its overall impact and novelty.

      Major: Results "Visualization of AdV-C5..." section:

      Why not also look at normal cells that can be synchronized? Cancer cells, such as A549, will by definition be highly heterogeneous and at all phases of the cell cycle. Primary non-transformed cells can easily be synchronized by contact inhibition and are much more physiologically relevant. "The virus particles bound..." - Can the spatial resolution of a confocal microscope truly differentiate individual particles that are sub-wavelength in size? What about the sensitivity for single particles? Some sort of experiment to show that single particles can be detected should be performed and shown to assure the readers that this is in fact possible. Furthermore, even when based on the particle to pfu ratio, the MOI would still be nearly 2000 pfu/cell, so the actual number of observed particles is an order of magnitude lower than what was applied to the cells.
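
      The inoculum argument can be spelled out as a one-line conversion. This is a minimal sketch, assuming that the moi of ~36250 quoted from the manuscript counts physical particles per cell and that the particle-to-pfu ratio is about 20:1; the ratio is a hypothetical placeholder chosen for illustration, not a value taken from the manuscript.

      ```python
      def infectious_moi(particles_per_cell, particles_per_pfu):
          """Convert a particle-based inoculum into pfu per cell."""
          if particles_per_pfu <= 0:
              raise ValueError("particles_per_pfu must be positive")
          return particles_per_cell / particles_per_pfu


      # Assumed inputs: 36250 particles per cell and a hypothetical 20:1
      # particle-to-pfu ratio (illustrative only).
      moi_pfu = infectious_moi(36250, 20)
      print(f"{moi_pfu} pfu/cell")
      ```

      Under these assumptions the result is on the order of a couple of thousand pfu per cell, which is the scale the comment above refers to.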

      Fig. 4 - I am not certain that the observed difference is significant, at least looking at it, beyond the width difference of the peaks, highest expression for both is largely in G1. It would be nice to see this using a western blot of cell cycle sorted cells, which can easily be accomplished using FACS. Page 15, 2nd paragraph. It would be valuable and informative to determine whether there is heterogeneity in histone association with these different vDNAs and whether these histones exhibit divergent modifications (enabling or restricting transcription). Same as above. I am rather surprised that the DBP signal did not correlate well with vDNA signal, particularly for the larger replication centers. How can this be reconciled? Was there an increase in overall vDNA signal later in infection? It is important to know this as it determines whether the observed vDNA signal is real or could be caused by viral RNA or other background causes (non-infected controls notwithstanding). Can the signal be detected with inactivated viruses (via UV for example?)

      Page 18, 1st paragraph. It would be interesting to determine whether there was association between pol II and those genomes that showed no E1A, similarly to the histone suggestion. What about things like viral chromatin organization? Soriano et al. 2019 showed how E1A and E4orf3 work in tandem to alter viral chromatin organization by varying histone loading on the viral genome. Fig. 2. Can you really say that a single dot correlates with a single transcript? Has that been validated in any way?

      Minor:

      Page 5, last paragraph. "Transcirpts from the viral late transcription unit,..." This is not correct as recently shown by Crisostomo et al, 2019.

      Page 10, "... because AdvV-infected cells are less well adherent..." This is not strictly true as loss of attachment only occurs later on in infection. It would be helpful to have statistical significance indicated directly in the figures.

      The very high MOIs used are concerning, could these have negative effects on the cell viability or overall state?

      There are a few typos and such that should be corrected.

      Significance

      As I stated above, the work is interesting and significant, to a degree. The major limitation is that the novelty is low as a paper published in 2017 (cited by the authors) used a very similar approach to investigate a similar problem. In addition, there are multiple other recent papers looking at cell populations in the context of adenovirus infection, and whether a single cell or population based approach is better is unclear. This is something the authors might want to strengthen prior to submission.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      First of all, we thank all reviewers for their constructive suggestions and comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This group has been at the forefront recently of using imaging technologies to understand how chromosome segregation is coordinated in mammalian oocytes, and why errors occur. In the current paper they examine the dynamics of microtubule organising centres (which effectively replace centrioles/centrosomes in oocytes) in MI. The imaging of oocytes in this paper is beautiful. The major findings are (1) that MTOCs that are supposed to be at the spindle pole sometimes end up at the spindle equator, and this is documented very beautifully and (2) the correct positioning of MTOCs at the spindle pole appears to require kinetochore microtubules, as indicated by experiments manipulating the kinetochore component NDC80.

      We appreciate the reviewer’s comment and clear description of our study.

      **Major Comments**

      As such the major claims of the paper are basically well supported. However, the analyses are almost entirely restricted to prometaphase/metaphase, and the conclusions are relatively limited. The salient omission is any analysis of MTOC/chromosome relationship during anaphase. Were the paper to be extended to determine whether the lingering of MTOCs at the spindle equator is related to chromosome segregation error, that would increase the reach and importance of the work substantially. Specifically:

      Can tracking experiments be performed to determine whether the chromosome that shows movement similarities to the errant MTOC is more/less likely to missegregate? Complete tracking as these authors are expert at should achieve this, or photo-labelling the desired chromosome.

      Thank you for your comment. In our experimental system, oocytes rarely exhibit chromosome segregation errors (

      Can the position of MTOCs (proportion that linger at the equator) be manipulated in the absence of other defects to determine whether this increases errors (lagging at anaphase, metaphase-II chromosome counting spreads)?

      We agree with the reviewer that a specific manipulation of MTOC positions is exactly what we would need to investigate the significance of central MTOCs. Unfortunately, there are currently no tools available to specifically manipulate MTOC positions without other defects. Therefore, the significance of central MTOCs is currently unclear. In the revised manuscript, we will state these points in Discussion.

      The above analysis would have to be well supported by controls showing that these constructs are having no impact on normal anaphase (proportion of oocytes completing meiosis-I, likelihood of lagging chromosomes etc).

      Thank you for the comment. As we answered above, control oocytes rarely exhibit chromosome segregation errors or lagging chromosomes (

      Related to the above, though I appreciate a fixed metaphase image of MTOC immunofluorescence is presented, the paper is about the dynamics of MTOCs and thus nonetheless relies heavily on the live imaging of cep192. The core results should be confirmed using another (substantially different) MTOC probe. *This final comment applies to the current metaphase data, regardless of whether the study is ultimately extended*

      Thank you for the suggestion. We will confirm the dynamics of MTOCs at metaphase with mEGFP-Cdk5Rap2, another established marker of MTOCs.

      Reviewer #1 (Significance (Required)):

      As explained above, as presented this paper is largely scientifically sound, but far more limited in scope than this group's other recent papers. The paper would be made more impactful and the readership broadened if a relationship between MTOC position/movement and segregation problems were established, or alternatively if it were established why some MTOCs sometimes linger at the spindle equator. Whilst to my knowledge this is the first time that equator MTOCs have been documented so carefully, oocyte cell biologists may not find the core observation that MTOCs are occasionally at the spindle equator extremely surprising.

      Thank you for your helpful suggestions. Due to the lack of tools to specifically manipulate MTOC positions, we are unfortunately not able to directly address whether MTOC position/movement contributes to chromosome segregation problems. On the other hand, we are currently working to answer your important question ‘why some MTOCs sometimes linger at the spindle equator’. We speculate that MTOCs become central due to unstable kinetochore-microtubule attachments, which are predominantly observed at early metaphase in normal oocytes. To test this idea, we are currently investigating whether the appearance of central MTOCs is prevented by forced stabilization of kinetochore-microtubule attachments with Ndc80-9A. Our pilot analysis thus far supports this idea. In light of your suggestions, we will incorporate the results into the revised manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      I am commenting on the work of Courtois et al. as an expert in the biochemistry of spindle formation with a focus on acentriolar assembly.

      First and foremost, this is a technically excellent study with a number of very interesting and well-documented observations, which are highly relevant for our understanding of the mechanisms of acentriolar spindle formation in the mouse oocyte model. In principle, the manuscript is in a very mature state. However, my major concern at this point would be that there is a break in the story. It starts describing the (very interesting) observation of "central MTOCs". After thoroughly investigating how these behave, the authors stop and look at overall MTOC distribution after loss of stable MT-kinetochore interactions based on oocytes expressing the Ndc80_9D mutant instead of wt Ndc80. The two parts are experimentally and conceptually not well connected.

      We appreciate your comments on our techniques and novel observations in this study, and thank you for your helpful suggestions.

      Answering the following questions may help to further develop the paper:

      If I understand the arguments correctly, central MTOCs are an "accident" on the way to complete meiosis I spindle formation, which will eventually be corrected and all MTOCs clustered at the poles. Thus, they may serve as an assay for spindle assembly fidelity and kinetics (?). At this point, the reader is left with the observation without efforts to explain the meaning of this observation, ideally experimentally, or at least in a valid discussion.

      Thank you for your thoughtful comment. We agree that we should clearly explain our view on central MTOCs. We indeed speculate that central MTOCs are an “accident” due to unstable kinetochore-microtubule attachments, which are normally pronounced at early metaphase.

      We will revise the manuscript as follows: (1) Following the section for the observation of central MTOCs, we will state our hypothesis that central MTOCs may appear due to unstable kinetochore–microtubule attachments. (2) We will introduce our experiment of the manipulation of kinetochore–microtubule attachment stability as a test for our hypothesis. (3) We will present new results of our analysis for the effects of kinetochore–microtubule attachment stability on the appearance of central MTOCs (please see below).

      Enthusiasm for the technically excellent experiments using the Ndc80 variants is somewhat reduced, as conclusions from these experiments are published in the parallel paper of the same laboratory (Yoshida et al.). In my opinion, it may thus be even more important to connect these observations with the first part describing central MTOCs and to clarify their significance.

      Thank you for the important suggestion.

      First, we agree that we should connect our observations of central MTOCs to the phenotypes of Ndc80 manipulations. To do this, we will reanalyze our dataset to quantify the effects of Ndc80 manipulations on central MTOCs. Our pilot analysis thus far suggests that the forced stabilization of kinetochore–microtubule attachments by Ndc80-9A reduces the appearance of central MTOCs. This would support our idea that central MTOCs appear due to unstable kinetochore–microtubule attachments.

      Second, we agree with the reviewer that experimental clarification of the significance of central MTOCs would be valuable. However, as outlined above, we unfortunately have no tool to directly address the significance of MTOC positioning in the fidelity of spindle assembly and chromosome segregation. Although we assume that MTOC positioning is critical for spindle assembly fidelity, as generally thought based on previous studies (Breuer et al., 2010; Clift and Schuh, 2015; Schuh and Ellenberg, 2007), the significance of MTOC positioning in spindle assembly remains uncertain, as you (and also Reviewer 1) point out. We will discuss these points in the revised manuscript.

      Shown in Fig. 3B but not fully explained: How does the distribution of what is defined as central MTOCs behave in Ndc80_wt and Ndc80_9A mutant oocytes? Do the variants differ, i.e. are there fewer, or less persistent central MTOCs in the 9A mutant? Would they differ in kinetics of appearance and "rescue" to the poles?

      Thank you for the question. As outlined above, we will reanalyze our dataset to quantify the effects of Ndc80-9A on the behavior of central MTOCs. Our pilot analysis suggests that the forced stabilization of kinetochore–microtubule attachments suppresses the appearance of central MTOCs.

      Similarly: is there a correlation of central MTOC appearance, Ndc80 phosphorylation/stability of kinetochore attachment and Anaphase I onset? The authors mention that oocytes expressing the 9A mutant go faster into Anaphase.

      Thank you for this comment. First, we will investigate whether the levels of Ndc80 phosphorylation at kinetochores correlate with the distance to central MTOCs. Second, we will address whether microtubules connect kinetochores to central MTOCs. Third, we will track chromosomes that showed correlated motions with closely positioned MTOCs until anaphase onset.

      The observation that "central MTOCs exhibited correlated motions with closely positioned kinetochores" is poorly defined, yet an important observation. Does this mean some sort of short k-fibers remain to connect central MTOCs and kinetochores? Wouldn't one expect that the loss of stable end-on-attachment causes MTOCs to become central? How does this fit into a/the model?

      We believe these concerns will be addressed by the experiments/analyses proposed above. First, we will check if central MTOCs are connected to kinetochores by microtubules. Second, we indeed speculate that loss of stable kinetochore-microtubule attachment allows MTOCs to become central. We will test this idea by quantifying the appearance of central MTOCs in Ndc80-9A-expressing oocytes.

      Along the same lines: The authors hype their conclusion that kinetochores dominate meiosis I spindle formation based on the observation that loss of kinetochore functions results in less well-organized spindle poles and worse MTOC "confinement". This may mean that kinetochores, together with MTOCs, maintain stable k-fibers in meiosis, as shown here and in Yoshida et al. When one or the other end of k-fibers is destabilized (loss of end-on-attachment, loss of MTOC attachment), the fibers collapse and the remaining minus-or-plus-end associated structure loses its destination. We then see central MTOCs and/or kinetochores at poles. In this respect, the interpretation / discussion should be less "kinetochore-centered".

      We agree with your thoughtful comment that the regulations of minus-ends (e.g. MTOCs) and of plus-ends (e.g. kinetochores) are equally relevant for spindle bipolarization. We will tone down our kinetochore-centered view in the Abstract and Discussion and revise them into more balanced statements.

      Is there any way to determine the efficiency of Ndc80 knockdown in the respective gene replacement experiment? I share the view of the authors that their method may be more efficient and may explain apparent discrepancies to previous studies on Ndc80-9A (Gui and Homer, 2013) with more dramatic effects on spindle geometry. However, at that point, this remains speculative. For instance, one may also speculate vice versa that the ko strategy used here is less efficient in a maternally dominated system and leaves behind more wt Ndc80, which better compensates defects seen in the 9A mutant.

      Our gene deletion strategy (Zp3-Cre Ndc80f/f) resulted in >90% depletion of the Ndc80 protein (estimated by Western blot; Supplementary Figure 1c in Yoshida et al, Nat Commun 2020). On the other hand, Gui and Homer report that their morpholino-based depletion strategy resulted in 60–70% depletion of the Ndc80 protein (estimated by Western blot; Figure 1B in Gui and Homer, Dev Cell 2013). Thus, the depletion was more efficient in our experimental system. We will add this information in the manuscript.

      Reviewer #2 (Significance (Required)):

      Courtois et al. present data on mechanisms governing spindle assembly in mouse oocytes. Mouse oocytes serve as a model system for spindle formation in the absence of centriole-based MTOCs. At the onset of meiosis I, numerous MTOCs form, which shape a mass ("ball") of MT nucleated around chromatin into a bipolar structure. Accumulating evidence indicates that kinetochores play an important role in acentriolar spindle formation in mouse oocytes, yet the mechanisms behind kinetochore action remain unclear.

      Here, Courtois et al. analyze spindle formation in live mouse oocytes using 3D-time-lapse imaging. They use fluorescently tagged Cep192 to track MTOCs and Histone H2B or CENP-C to visualize chromatin or kinetochores. In the first part, the authors deal with the appearance of "central MTOCs", i.e. aggregates of centrosomal protein(s) that, apparently, fail to remain stably integrated into the spindle pole clusters on MTOCs during spindle formation. The authors convincingly demonstrate that these central MTOCs can be seen in the majority of spindles investigated. They demonstrate that central MTOCs generally come from positions at poles from where they "fall back" towards chromosomes. Central MTOCs may even cross the spindle and end up at the pole opposite to the one from which they originated. Interestingly, central MTOCs are often found next to kinetochores.

      In the second part, the authors focus on the role of kinetochores and their stable MT attachment for spindle formation in general and bipolarity/pole organization in particular. The same lab has published data on the role of kinetochores in meiosis I spindle very recently (Yoshida et al. Nat Comm, 2020). Here, they successfully exploit Ndc80 phospho-mutants to compare MTOC distribution in oocytes with reduced or increased end-on-attachment. The data show that stable end-on attachment determines stable MTOC clustering at spindle poles and governs the maintenance of bipolarity and spindle length.

      Thank you for your clear description of our study.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In order to assemble a bipolar structure, acentrosomal spindles rely on multiple non-centrosomal pathways. Mouse oocytes specifically build bipolar spindles by sorting and clustering of microtubule organizing centers (MTOCs). While microtubule cross-linkers, spindle motors and microtubule nucleators are involved, the role of kinetochores and kinetochore-microtubule attachments in meiotic spindle assembly and maintenance has not been thoroughly tested. Using an impressive combination of live cell imaging and semi-automated image analysis, Courtois et al. quantified MTOC behavior in bipolar mouse oocyte spindles and found an ongoing MTOC sorting in metaphase and instances of MTOC-kinetochore associations. The authors further employed an elegant genetic system to replace NDC80 in maturing oocytes with a mutant almost completely unable to form stable microtubule-kinetochore attachments. The data show lack of MTOC confinement at the spindle poles and increased spindle elongation while maintaining spindle bipolarity. The authors concluded that stable kinetochore-microtubule attachments are required to confine MTOCs at the poles, which in turn sets an optimal spindle length. Overall, the data are of very high quality and clearly presented, the manuscript is easy to follow, and the methods are comprehensively described. One concern is the lack of mechanistic link between the natural metaphase MTOC sorting (Fig. 1-2) and massive MTOC rearrangements observed with the NDC80-9D mutant (Fig. 3). A second concern is that deficient MTOC confinements and spindle elongation observed with the 9D mutant could be due to unaligned chromosomes rather than lack of stable kinetochore-microtubule attachments, which is the authors' interpretation.

      **Major Points:**

      1) Massive MTOC rearrangements (Supplementary Video 6) are reminiscent of spindle assembly defects or spindle collapse. Since these spindles do not reach a normal metaphase and seem to change shape (Supplementary Video 6; 11:10), it is difficult to differentiate between spindle assembly and spindle maintenance defects. Is there a difference in the timing of bipolar spindle assembly for NDC80-9D vs WT? If so, one interpretation is that stable attachments not only ensure MTOC confinement but also contribute to bipolar spindle assembly.

      We apologize for the lack of explanation for the spindle dynamics seen in Supplementary Video 6, 11:10. At this time point, the spindle rotated in 3D, which appeared as if the spindle collapsed in the z-projection movie. We will add this explanation into the legend.

      Our quantitative analysis of spindle shape in 3D indicated no increased collapse in Ndc80-9D, based on the signals of the spindle marker EGFP-Map4. Moreover, we observed no detectable difference in the timing of the onset of bipolar spindle assembly, as long as we define it with EGFP-Map4 signals. These results are shown in Figure 4B.

      2) Fig. 1-2 vs Fig. 3 - It is not clear how the discrete MTOC sorting phenotype presented in Fig. 1-2 relates to the massive MTOC collapse shown in Fig. 3. The natural MTOC sorting and MTOC-kinetochore associations seem to be happening within the bipolar structure confined by the polar MTOCs. The MTOC rearrangements (e.g., Supplementary Video 6) are much more drastic, reminiscent of a spindle collapse. To make a mechanistic link between the phenotypes, it would be useful to use an intermediate NDC80 mutant (e.g., NDC80-4D; Zaytsev et al., 2014 JCB) that may support chromosome alignment and maintenance of the canonical bipolar spindle structure, but still show effects on MTOC sorting.

      Thank you for your nice suggestion. We will test Ndc80-4D. The construct is ready.

      3) Fig. 4 - The authors should provide evidence that unstable kinetochore-microtubule attachments, rather than chromosome-derived signals of misaligned chromosomes (e.g., from Ran or Aurora B), limit spindle elongation. For example, the authors could measure spindle elongation in oocytes with misaligned chromosomes but stable attachments: for example, NDC80-9A oocytes released from an Eg5 inhibition block should carry a number of polar chromosomes with stable attachments. The expectation would be that such spindles form with confined MTOCs and do not elongate as much as NDC80-9D expressing oocytes.

      Thank you for this important suggestion. Following your suggestion, we have conducted a pilot experiment using monastrol washout. Unfortunately, we did not observe increased chromosome misalignment in Ndc80-9A. We will explore alternative experimental conditions.

      Moreover, we propose to perform an additional experiment. We will use cohesin depletion with Rec8 TRIM-Away, which will produce chromosome misalignment and reduce kinetochore-microtubule attachment stability. We expect that these oocytes will exhibit excessive spindle elongation. We will then ask whether Ndc80-9A, which would forcibly stabilize kinetochore-microtubule attachment (but fail to align chromosomes due to loss of chromosome cohesion), can suppress excessive spindle elongation.

      These experiments will allow us to address the direct contribution of kinetochore-microtubule attachment to proper spindle elongation. However, in our opinion, regardless of the results, we cannot exclude the possibility that chromosome alignment contributes to proper spindle elongation, which is indeed an intriguing hypothesis. We will discuss these possibilities in Discussion.

      4) Figure 5D - The authors' model suggests that MTOCs are confined due to their connection to stably attached k-fibers. It would be useful to speculate on the molecular mechanism behind the confinement. Does a maximal k-fiber length restrict the elongation, or is there a pulling force exerted by the kinetochores?

      Thank you for your thoughtful suggestion. As the reviewer suggests, we speculate that the length of k-fibers is critical for restricting MTOC position and spindle elongation. K-fibers may prevent excessive spindle elongation by anchoring MTOCs at their minus ends. Alternatively, k-fibers may act as a platform that inactivates spindle bipolarizers. We will discuss these possibilities in our revised manuscript.

      5) Discussion - Lines 203-204 - "The findings of this study, together with recent studies, suggest a model for how kinetochore-microtubule attachments contribute to acentrosomal spindle assembly (Figure 5D)". - Throughout the paper the authors underscore that bipolar spindles do assemble with the NDC80-9D mutant. The authors should clarify whether or not spindle assembly is affected by the NDC80-9D mutant.

      Thank you for your comment. We agree with the reviewer that we should clearly state our conclusion based on the phenotype of the Ndc80-9D mutant. Our conclusion is that stable kinetochore-microtubule attachment fine-tunes bipolar spindle assembly. If oocytes lack stable attachments, they can form a bipolar-shaped spindle composed of microtubule arrays that are largely bipolar, but the spindle becomes excessively elongated and lacks MTOCs at its poles. We will explicitly state these ideas in our revised manuscript.

      **Minor Points:**

      1) Introduction - Lines 38-44 - The authors should cite the role of the Augmin complex in acentrosomal spindle assembly (Watanabe et al., 2016 Cell Reports).

      Thank you for your excellent suggestion. We will cite this relevant paper.

      2) Results - Lines 55-56 - "However, the precise manipulation of the stability of kinetochore-microtubule attachments has not been tested" - Gui and Homer 2013 studied the outcome of NDC80 depletion and tested the NDC80-9A mutant in the context of oocyte spindle assembly. Although, as the authors point out in the Discussion section, there might be differences in the experimental design that lead to different conclusions, it is not entirely accurate that precise manipulations of attachment stability have not been tested. A different wording (e.g., "has not been comprehensively tested") may be better.

      Thank you for your suggestion. We agree that “has not been comprehensively tested” fits better.

      3) Results - Lines 162-164 - "Ndc80-9D-expressing oocytes had no significant delay in the onset of spindle elongation, but had significantly faster kinetics of elongation compared to Ndc80-WT- and Ndc80-9D-expressing oocytes" - The authors probably meant "... Ndc80-9A expressing oocytes."

      Thank you for pointing out this mistake. We will correct it.

      4) Discussion - Lines 239-242 - "... microtubule nucleation in later stages may not be determined by MTOCs but are largely attributed to nucleation within the spindle, as observed by microtubule plus-end tracking in bipolar-shaped spindles (Supplementary Figure 4)." - Strictly speaking, EB3 comets indicate microtubule polymerization rather than nucleation. Microtubule nucleation within the spindle is, however, supported by studies of the Augmin complex (e.g., Watanabe et al., 2016 Cell Rep).

      Thank you for your comment. We will correct our wording for EB3 comets and discuss that microtubule nucleation within the spindle is shown in Watanabe et al., 2016 Cell Rep.

      5) Discussion - Lines 257-260 - "The lagging MTOCs can be positioned close to kinetochores on bi-oriented chromosomes, underscoring the importance of active error corrections of kinetochore-microtubule attachments during metaphase (Lane and Jones, 2014; Yoshida et al., 2015)." - The reasoning here is not clear. Does the number/persistence of lagging MTOCs correlate with chromosome mis-alignment or with the efficiency/timing of chromosome alignment in WT cells?

      We apologize that our discussion was not clear. Previous studies (Lane and Jones, 2014; Yoshida et al., 2015) show that kinetochore-microtubule attachment errors are found on aligned chromosomes during metaphase and must be corrected until anaphase onset in oocytes. We speculate that lagging (or central) MTOCs may be a source of such kinetochore-microtubule attachment errors, although we cannot directly test this hypothesis due to lack of tools to specifically manipulate MTOC positions. We will discuss these points in Discussion.

      To check if central MTOCs are correlated with chromosome misalignment, we will perform the tracking of chromosomes that were closely positioned to lagging MTOCs.

      6) Discussion - Line 266 - "Yoshida et al., 2020" - This article is cited elsewhere in the text as "Yoshida et al., in press".

      Thank you for pointing out these mistakes. We will correct them.

      Reviewer #3 (Significance (Required)):

      Courtois et al. have found a new mechanism contributing to acentrosomal spindle assembly in mouse oocytes. Although kinetochore-dependent spindle assembly occurs in mitotic cells (e.g., Toso et al., 2009 JCB), only the recent work from the Kitajima lab (Yoshida et al., 2020 Nat Comm; this manuscript) showed that kinetochores also impact acentrosomal spindle assembly in meiosis. The genetic model presented here brings a significant technical advance in dissecting relative contributions of spindle assembly pathways in mouse oocytes (e.g., Schuh and Ellenberg 2007 Cell; Watanabe et al., 2016 Cell Rep; Drutovic et al., 2020 EMBO J) and complements current methods used to study meiotic error-correction (e.g., Chmatal et al., 2015 Curr Biol, Yoshida et al., 2015 Dev Cell; Vallot et al., 2018 Curr Biol and many others). This model expands an existing toolbox of techniques allowing complete elimination of the endogenous protein specifically in mature mouse oocytes (Clift et al., 2017 Cell; Clift et al., 2018 Nat Protocols), which is a difficult feat due to a limited capacity of ex-vivo culture (Pfender et al., 2015 Nature). Therefore, the work presented in this manuscript may encourage other researchers to establish similar systems for oocyte-specific manipulations, which will allow more precise insight into oocyte biology.

      Expertise keywords: spindle dynamics, chromosome segregation, mitosis, meiosis

      We appreciate your comments. Additional experiments following on your constructive comments will further improve our manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In order to assemble a bipolar structure, acentrosomal spindles rely on multiple non-centrosomal pathways. Mouse oocytes specifically build bipolar spindles by sorting and clustering of microtubule organizing centers (MTOCs). While microtubule cross-linkers, spindle motors and microtubule nucleators are involved, the role of kinetochores and kinetochore-microtubule attachments in meiotic spindle assembly and maintenance has not been thoroughly tested. Using an impressive combination of live cell imaging and semi-automated image analysis, Courtois et al. quantified MTOC behavior in bipolar mouse oocyte spindles and found an ongoing MTOC sorting in metaphase and instances of MTOC-kinetochore associations. The authors further employed an elegant genetic system to replace NDC80 in maturing oocytes with a mutant almost completely unable to form stable microtubule-kinetochore attachments. The data show lack of MTOC confinement at the spindle poles and increased spindle elongation while maintaining spindle bipolarity. The authors concluded that stable kinetochore-microtubule attachments are required to confine MTOCs at the poles, which in turn sets an optimal spindle length. Overall, the data are of very high quality and clearly presented, the manuscript is easy to follow, and the methods are comprehensively described. One concern is the lack of mechanistic link between the natural metaphase MTOC sorting (Fig. 1-2) and massive MTOC rearrangements observed with the NDC80-9D mutant (Fig. 3). A second concern is that deficient MTOC confinement and spindle elongation observed with the 9D mutant could be due to unaligned chromosomes rather than lack of stable kinetochore-microtubule attachments, which is the authors' interpretation.

      Major Points:

      1) Massive MTOC rearrangements (Supplementary Video 6) are reminiscent of spindle assembly defects or spindle collapse. Since these spindles do not reach a normal metaphase and seem to change shape (Supplementary Video 6; 11:10), it is difficult to differentiate between spindle assembly and spindle maintenance defects. Is there a difference in the timing of bipolar spindle assembly for NDC80-9D vs WT? If so, one interpretation is that stable attachments not only ensure MTOC confinement but also contribute to bipolar spindle assembly.

      2) Fig. 1-2 vs Fig. 3 - It is not clear how the discrete MTOC sorting phenotype presented in Fig. 1-2 relates to the massive MTOC collapse shown in Fig. 3. The natural MTOC sorting and MTOC-kinetochore associations seem to be happening within the bipolar structure confined by the polar MTOCs. The MTOC rearrangements (e.g., Supplementary Video 6) are much more drastic, reminiscent of a spindle collapse. To make a mechanistic link between the phenotypes, it would be useful to use an intermediate NDC80 mutant (e.g., NDC80-4D; Zaytsev et al., 2014 JCB) that may support chromosome alignment and maintenance of the canonical bipolar spindle structure, but still show effects on MTOC sorting.

      3) Fig. 4 - The authors should provide evidence that unstable kinetochore-microtubule attachments, rather than chromosome-derived signals of misaligned chromosomes (e.g., from Ran or Aurora B), limit spindle elongation. For example, the authors could measure spindle elongation in oocytes with misaligned chromosomes but stable attachments: for example, NDC80-9A oocytes released from an Eg5 inhibition block should carry a number of polar chromosomes with stable attachments. The expectation would be that such spindles form with confined MTOCs and do not elongate as much as NDC80-9D expressing oocytes.

      4) Figure 5D - The authors' model suggests that MTOCs are confined due to their connection to stably attached k-fibers. It would be useful to speculate on the molecular mechanism behind the confinement. Does a maximal k-fiber length restrict the elongation, or is there a pulling force exerted by the kinetochores?

      5) Discussion - Lines 203-204 - "The findings of this study, together with recent studies, suggest a model for how kinetochore-microtubule attachments contribute to acentrosomal spindle assembly (Figure 5D)". - Throughout the paper the authors underscore that bipolar spindles do assemble with the NDC80-9D mutant. The authors should clarify whether or not spindle assembly is affected by the NDC80-9D mutant.

      Minor Points:

      1) Introduction - Lines 38-44 - The authors should cite the role of the Augmin complex in acentrosomal spindle assembly (Watanabe et al., 2016 Cell Reports).

      2) Results - Lines 55-56 - "However, the precise manipulation of the stability of kinetochore-microtubule attachments has not been tested" - Gui and Homer 2013 studied the outcome of NDC80 depletion and tested the NDC80-9A mutant in the context of oocyte spindle assembly. Although, as the authors point out in the Discussion section, there might be differences in the experimental design that lead to different conclusions, it is not entirely accurate that precise manipulations of attachment stability have not been tested. A different wording (e.g., "has not been comprehensively tested") may be better.

      3) Results - Lines 162-164 - "Ndc80-9D-expressing oocytes had no significant delay in the onset of spindle elongation, but had significantly faster kinetics of elongation compared to Ndc80-WT- and Ndc80-9D-expressing oocytes" - The authors probably meant "... Ndc80-9A expressing oocytes."

      4) Discussion - Lines 239-242 - "... microtubule nucleation in later stages may not be determined by MTOCs but are largely attributed to nucleation within the spindle, as observed by microtubule plus-end tracking in bipolar-shaped spindles (Supplementary Figure 4)." - Strictly speaking, EB3 comets indicate microtubule polymerization rather than nucleation. Microtubule nucleation within the spindle is, however, supported by studies of the Augmin complex (e.g., Watanabe et al., 2016 Cell Rep).

      5) Discussion - Lines 257-260 - "The lagging MTOCs can be positioned close to kinetochores on bi-oriented chromosomes, underscoring the importance of active error corrections of kinetochore-microtubule attachments during metaphase (Lane and Jones, 2014; Yoshida et al., 2015)." - The reasoning here is not clear. Does the number/persistence of lagging MTOCs correlate with chromosome mis-alignment or with the efficiency/timing of chromosome alignment in WT cells?

      6) Discussion - Line 266 - "Yoshida et al., 2020" - This article is cited elsewhere in the text as "Yoshida et al., in press".

      Significance

      Courtois et al. have found a new mechanism contributing to acentrosomal spindle assembly in mouse oocytes. Although kinetochore-dependent spindle assembly occurs in mitotic cells (e.g., Toso et al., 2009 JCB), only the recent work from the Kitajima lab (Yoshida et al., 2020 Nat Comm; this manuscript) showed that kinetochores also impact acentrosomal spindle assembly in meiosis. The genetic model presented here brings a significant technical advance in dissecting relative contributions of spindle assembly pathways in mouse oocytes (e.g., Schuh and Ellenberg 2007 Cell; Watanabe et al., 2016 Cell Rep; Drutovic et al., 2020 EMBO J) and complements current methods used to study meiotic error-correction (e.g., Chmatal et al., 2015 Curr Biol, Yoshida et al., 2015 Dev Cell; Vallot et al., 2018 Curr Biol and many others). This model expands an existing toolbox of techniques allowing complete elimination of the endogenous protein specifically in mature mouse oocytes (Clift et al., 2017 Cell; Clift et al., 2018 Nat Protocols), which is a difficult feat due to a limited capacity of ex-vivo culture (Pfender et al., 2015 Nature). Therefore, the work presented in this manuscript may encourage other researchers to establish similar systems for oocyte-specific manipulations, which will allow more precise insight into oocyte biology.

      Expertise keywords: spindle dynamics, chromosome segregation, mitosis, meiosis

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      I am commenting on the work of Courtois et al. as an expert in the biochemistry of spindle formation with a focus on acentriolar assembly.

      First and foremost, this is a technically excellent study with a number of very interesting and well-documented observations, which are highly relevant for our understanding of the mechanisms of acentriolar spindle formation in the mouse oocyte model. In principle, the manuscript is in a very mature state. However, my major concern at this point would be that there is a break in the story. It starts by describing the (very interesting) observation of "central MTOCs". After thoroughly investigating how these behave, the authors stop and look at overall MTOC distribution after loss of stable MT-kinetochore interactions based on oocytes expressing the Ndc80_9D mutant instead of wt Ndc80. The two parts are experimentally and conceptually not well connected.

      Answering the following questions may help to further develop the paper:

      1. If I understand the arguments correctly, central MTOCs are an "accident" on the way to complete meiosis I spindle formation, which will eventually be corrected and all MTOCs clustered at the poles. Thus, they may serve as an assay for spindle assembly fidelity and kinetics (?). At this point, the reader is left with the observation without efforts to explain the meaning of this observation, ideally experimentally, or at least in a valid discussion.
      2. Enthusiasm for the technically excellent experiments using the Ndc80 variants is somewhat reduced, as conclusions from these experiments are published in the parallel paper of the same laboratory (Yoshida et al.). In my opinion, it may thus be even more important to connect these observations with the first part describing central MTOCs and to clarify their significance.
      3. Shown in Fig. 3B but not fully explained: How does the distribution of what is defined as central MTOCs behave in Ndc80_wt and Ndc80_9A mutant oocytes? Do the variants differ, i.e. are there fewer, or less persistent central MTOCs in the 9A mutant? Would they differ in kinetics of appearance and "rescue" to the poles?
      4. Similarly: is there a correlation of central MTOC appearance, Ndc80 phosphorylation/stability of kinetochore attachment and Anaphase I onset? The authors mention that oocytes expressing the 9A mutant go faster into Anaphase.
      5. The observation that "central MTOCs exhibited correlated motions with closely positioned kinetochores" is poorly defined, yet an important observation. Does this mean some sort of short k-fibers remain to connect central MTOCs and kinetochores? Wouldn't one expect that the loss of stable end-on-attachment causes MTOCs to become central? How does this fit into a/the model?
      6. Along the same lines: The authors hype their conclusion that kinetochores dominate meiosis I spindle formation based on the observation that loss of kinetochore functions results in less well-organized spindle poles and worse MTOC "confinement". This may mean that kinetochores, together with MTOCs, maintain stable k-fibers in meiosis, as shown here and in Yoshida et al. When one or the other end of k-fibers is destabilized (loss of end-on-attachment, loss of MTOC attachment), the fibers collapse and the remaining minus-or-plus-end associated structure loses its destination. We then see central MTOCs and/or kinetochores at poles. In this respect, the interpretation / discussion should be less "kinetochore-centered".
      7. Is there any way to determine the efficiency of Ndc80 knockdown in the respective gene replacement experiment? I share the view of the authors that their method may be more efficient and may explain apparent discrepancies with previous studies on Ndc80-9A (Gui and Homer, 2013) with more dramatic effects on spindle geometry. However, at this point, this remains speculative. For instance, one may also speculate vice versa that the ko strategy used here is less efficient in a maternally dominated system and leaves behind more wt Ndc80, which better compensates defects seen in the 9A mutant.

      Significance

      Courtois et al. present data on mechanisms governing spindle assembly in mouse oocytes. Mouse oocytes serve as a model system for spindle formation in the absence of centriole-based MTOCs. At the onset of meiosis I, numerous MTOCs form, which shape a mass ("ball") of MTs nucleated around chromatin into a bipolar structure. Accumulating evidence indicates that kinetochores play an important role in acentriolar spindle formation in mouse oocytes, yet the mechanisms behind kinetochore action remain unclear.

      Here, Courtois et al. analyze spindle formation in live mouse oocytes using 3D-time-lapse imaging. They use fluorescently tagged Cep192 to track MTOCs and Histone H2B or CENP-C to visualize chromatin or kinetochores. In the first part, the authors deal with the appearance of "central MTOCs", i.e. aggregates of centrosomal protein(s) that, apparently, fail to remain stably integrated into the spindle pole clusters of MTOCs during spindle formation. The authors convincingly demonstrate that these central MTOCs can be seen in the majority of spindles investigated. They demonstrate that central MTOCs generally come from positions at poles from where they "fall back" towards chromosomes. Central MTOCs may even cross the spindle and end up at the pole opposite to the one from which they originated. Interestingly, central MTOCs are often found next to kinetochores.

      In the second part, the authors focus on the role of kinetochores and their stable MT attachment for spindle formation in general and bipolarity/pole organization in particular. The same lab has published data on the role of kinetochores in meiosis I spindle very recently (Yoshida et al. Nat Comm, 2020). Here, they successfully exploit Ndc80 phospho-mutants to compare MTOC distribution in oocytes with reduced or increased end-on-attachment. The data show that stable end-on attachment determines stable MTOC clustering at spindle poles and governs the maintenance of bipolarity and spindle length.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This group has been at the forefront recently of using imaging technologies to understand how chromosome segregation is coordinated in mammalian oocytes, and why errors occur. In the current paper they examine the dynamics of microtubule organising centres (which effectively replace centrioles/centrosomes in oocytes) in MI. The imaging of oocytes in this paper is beautiful. The major findings are (1) that MTOCs that are supposed to be at the spindle pole sometimes end up at the spindle equator, and this is documented very beautifully and (2) the correct positioning of MTOCs at the spindle pole appears to require kinetochore microtubules, as indicated by experiments manipulating the kinetochore component NDC80.

      Major Comments

      As such the major claims of the paper are basically well supported. However, the analyses are almost entirely restricted to prometaphase/metaphase, and the conclusions are relatively limited. The salient omission is any analysis of the MTOC/chromosome relationship during anaphase. Were the paper to be extended to determine whether the lingering of MTOCs at the spindle equator is related to chromosome segregation error, that would increase the reach and importance of the work substantially. Specifically:

      1. Can tracking experiments be performed to determine whether the chromosome that shows movement similarities to the errant MTOC is more/less likely to missegregate? Complete tracking as these authors are expert at should achieve this, or photo-labelling the desired chromosome.
      2. Can the position of MTOCs (proportion that linger at the equator) be manipulated in the absence of other defects to determine whether this increases errors (lagging at anaphase, metaphase-II chromosome counting spreads)?
      3. The above analysis would have to be well supported by controls showing that these constructs are having no impact on normal anaphase (proportion of oocytes completing meiosis-I, likelihood of lagging chromosomes etc).
      4. Related to the above, though I appreciate a fixed metaphase image of MTOC immunofluorescence is presented, the paper is about the dynamics of MTOCs and thus nonetheless relies heavily on the live imaging of Cep192. The core results should be confirmed using another (substantially different) MTOC probe. This final comment applies to the current metaphase data, regardless of whether the study is ultimately extended.

      Significance

      As explained above, as presented this paper is largely scientifically sound, but far more limited in scope than this group's other recent papers. The paper would be made more impactful and the readership broadened if a relationship between MTOC position/movement and segregation problems were established, or alternatively if it were established why some MTOCs sometimes linger at the spindle equator. Whilst to my knowledge this is the first time that equator MTOCs have been documented so carefully, oocyte cell biologists may not find the core observation that MTOCs are occasionally at the spindle equator extremely surprising.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank Reviewers #1 and #2 for the evaluation of our research and comments on our manuscript. Their comments are highly appreciated and addressed as described below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      *Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).*

      Here Ha et al. have further developed their Pumilio RNA tagging methodology for the isolation of UV-crosslinked proteins that are suggested to associate with Xist RNA in mouse embryonic stem cells (mESCs). Within this study the authors claim to have found the Lupus antigen RNA binding protein (La) as a novel Xist interacting partner that influences the efficacy of X-chromosome inactivation (XCI). The authors use a number of different techniques such as qPCR, fluorescent imaging, ATAC-SEQ and SHAPE to show aberration of XCI upon La shRNA knockdown. However, this study has significant flaws in the efficient isolation and validation of Xist associated proteins using their FLAG-out methodology. Furthermore, later experiments predominantly focus on cell death/survival assays, which is somewhat troubling given the essential roles La plays in processes such as cell differentiation and proliferation, ribosome biogenesis, transcriptional control and tRNA maturation. I feel the authors need to robustly address the potential effects La knockdown may be having on their mESCs.

      Reviewer #1 did not fully understand the basic designs of the experimental systems (FLAG-out and iXist), and completely rejected these experimental systems. Reviewer #1 also ignored the majority of the functional analysis on the candidate protein, Ssb. These issues cannot be addressed by additional experiments.

      **Major comments:**

      *-Are the key conclusions convincing?*

      My major concern is in their Xist RNA purification.

      First of all, I couldn't find any data proving the enrichment of Xist RNA itself in their Pumilio pull-down experiment. It would have been useful to show Xist RNA enrichment before the benzonase step. Secondly, it is hard to imagine that the protocol would successfully isolate Xist RNA-protein complexes from the cell. An earlier report by Clemson et al. (J Cell Biol., 1996) has shown that the majority of Xist RNA is still stuck in the nucleus after a nuclear matrix prep protocol using detergent, which is not so different from the authors' protocol. Moreover, the authors used UV crosslinking, which would have made it even harder to purify Xist RNA without sonication. Thirdly, as the tag is located at the 5' end of Xist RNA, it is rather surprising to see that Spen is not detected in their pulldown. Spen is one of the main functional interactors with Xist, robustly detected by several previous reports. Similarly, other high-affinity binders of Xist such as hnRNP-K and Ciz1 were also lacking from this screen. Finally, the peptide counts found associated with FLAG-out Xist are extremely low in comparison with other data using glutaraldehyde or formaldehyde crosslinking. For example, HnRNP-M found in Chu et al 2015 has 1120 peptide counts in differentiated cells. The authors here use HnRNP-M as a baseline for specific interactions and show a total of 6 peptide counts in Xist expressing cells and 5 in i-Empty cells (Supplementary excel sheet 1). Similarly, the La protein of interest in this study has 8 counts in i-FLAG-Xist and 6 counts in i-Empty. I struggle to see how this result indicates specific Xist binding. Worryingly, this is the starting rationale for the rest of their experiments, so it is hard to accept the rest of their conclusions either.

      We have detected Xist RNA after Pumilio pull-down, and added the data in the revised manuscript (Figure S1). The enrichment of Xist RNA by Pumilio pull-down is about 75-fold, comparable to the enrichment reported by Minajigi et al.

      Two out of three previous studies used similar protocols to prep cell lysates for co-IP, including UV cross-linking and detergent (McHugh et al. 2015 and Minajigi et al. 2015). The major difference between their protocols and ours is the co-IP step. They used antisense oligos to pull down the Xist RNA-protein complex, while we take advantage of the specific interaction between PUF and PBS to pull down the Xist RNA-protein complex. With the data in Figure S1, we are confident that our strategy is successful in isolating Xist RNA.

      For systematic identification of Xist binding proteins, each method has its own strengths and weaknesses. As we described in the introduction, only 4 proteins were commonly identified by all three studies that systematically identified Xist binding proteins. There is no doubt that our method also missed some authentic Xist binding proteins (false negatives) and identified some false positive candidates. Thus, we have to be careful in balancing between false negative and false positive calls. We applied the ranking gain to identify Xist binding protein candidates in order to minimize the false negative rate. Meanwhile, we compared our Xist binding protein candidate list with previously identified Xist-binding proteins to enhance the confidence in our candidate list.

      Regardless of the strengths and weaknesses of our method, Ssb is also an Xist-binding protein identified by another study (Chu et al. 2015). More importantly, we have provided experimental validation to confirm Ssb’s involvement in XCI and extensive functional analysis to reveal the protein’s mechanistic role in XCI.

      The other key conclusion the authors make is from the use of numerous cell death/survival assays for both male and female cell lines. This is extremely troubling in the context of assessing their target protein La. La is involved in multiple RNA maturation events of rRNAs, tRNAs and other polIII transcripts. Furthermore, La has been implicated in binding to the mRNA for Cyclin D1 in both human cells and mouse fibroblasts (NIH/3T3 - male), which show a significant effect on cell proliferation upon siRNA knockdown https://www.nature.com/articles/onc2010425. This, along with the observation that La knock-out blastocysts fail to develop any mice or ES cell lines (male or female), shows that the effect observed in the authors' results is most likely not X-linked cell death https://mcb.asm.org/content/mcb/26/4/1445.full.pdf. The authors need to show that their shRNA KD isn't affecting the proliferation and general fitness of their mESC lines.

      The cell death/survival assay was specially designed for analyzing the defect of XCI. The cell death of iXist ESCs upon adding Dox is due to the induction of Xist, which consequently initiates the silencing of the only X chromosome in male cells. Knockdown of genes involved in XCI compromises XCI, thus allowing cell survival. Given the diverse functions of Ssb in cell differentiation and proliferation, ribosome biogenesis, transcriptional control and tRNA maturation, one would expect slow growth and/or cell death of Ssb knockdown cells. Indeed, the result is consistent with our expectation (Figure 2C, without Dox). Nevertheless, more Ssb knockdown cells survive in the presence of Dox, compared with control cells (Figure 2C-E, with Dox), suggesting that Ssb plays an important role in XCI.

      *- Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?*

      As discussed above, I feel the authors have not clearly demonstrated Xist specific protein enrichment and haven't proven X-linked cell death. Due to the lack of necessary control experiments as discussed below, I feel the notion that La is involved directly in XCI as an RNA chaperone is currently preliminary/speculative.

      The FLAG-out experiment just provided an initial point for the study. We have demonstrated the interaction between Xist and Ssb by RIP. In addition, Ssb knockdown antagonizes the lethal effect of induced XCI in male cells, allowing more cells to survive. This is the opposite of what the diverse house-keeping functions of Ssb would predict, since loss of those functions should lead to slow proliferation or cell death. Therefore, the data here (Figure 2C-E) suggest a role of Ssb in XCI. In addition, we showed that knockdown of Ssb compromises the silencing of X-linked genes (Figure 2F, 2G, and 3E), the compaction of the X chromosome (Figure 3D), Xist cloud formation (Figure 4), epigenetic modifications on Xi (Figure 5), Xist RNA folding (Figure 6F-I), and Xist RNA stability (Figure 7C and D). All these data indicate that Ssb is involved in XCI by regulating Xist RNA folding.

      *- Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.*

      I would suggest them to show RT-qPCR results of Xist RNA enrichment from the sample after flagIP before benzonase treatment.

      We have the data, and added it to Figure S1.

      Also, it would have been more convincing if their negative control construct (i-Empty) would contain 25 copies of PBSb RNA at least.

      This is a good alternative design of the negative control. Using i-Empty expressing 25 copies of PBSb RNA would allow us to subtract the background caused by proteins binding to PBSb RNA. Yet, as discussed above, regardless of how we improve the experimental setting, we cannot completely avoid the issue of false positives and false negatives. The goal of the FLAG-out experiment is to generate a list of Xist binding protein candidates, whose binding to Xist and functions in XCI should be validated by additional experiments. With our current experimental setting, a list of Xist binding protein candidates has been generated, and we have validated the role of Ssb in XCI with subsequent experiments.

      In Fig1b, the total amount of proteins loaded on the gel is not equivalent between two lanes. The gel should show equivalent amounts of proteins on the gel. It looks like if the negative control sample had been loaded at the same amount as the one with Xist, the band pattern wouldn't be distinguishable between the two samples. Furthermore, as these samples were used in the following mass spectrometry screen it may suggest that the minimal increase in peptide counts observed in the iXist FLAG-out were due to an increased amount of sample being loaded? No controls are conducted to account for this.

      IP samples of i-Empty and i-FLAG-Xist were loaded in the gel in Figure 1b. It is expected that the IP sample of i-FLAG-Xist should pull down more proteins than the IP sample of i-Empty. The FLAG-PUFb bands (the strongest band in each lane) are about the same intensity in the two samples, indicating roughly equal loading. After normalization of gel loading according to the FLAG-PUFb bands, the upper part of the i-FLAG-Xist lane showed some unique bands.

      For mass spectrometry analysis, the two samples are loaded independently; therefore, comparing the absolute amount of each protein between the two samples does not always provide valuable information. In contrast, the relative amount of different proteins within one sample is not affected by the loading amount and is thus more informative. Therefore, we used the ranking information to estimate the relative amount of different proteins in each sample and used the ranking gain to further identify protein candidates.
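
      The ranking-gain idea described above can be illustrated with a minimal sketch. The exact definition used in the manuscript is not given here, so the sketch assumes that proteins are ranked by peptide count within each sample (rank 1 = most abundant) and that the ranking gain is the control rank minus the pull-down rank. The function names and the protein labels and counts in the usage note are hypothetical and are not data from the study.

```python
def rank_by_count(counts):
    """Map each protein to its within-sample rank (1 = highest count).

    Proteins with equal counts share the best rank of their tie group.
    """
    ordered = sorted(counts.values(), reverse=True)
    return {name: ordered.index(c) + 1 for name, c in counts.items()}


def ranking_gain(pulldown_counts, control_counts):
    """Assumed definition: gain = rank in control - rank in pull-down.

    A positive gain means the protein ranks higher (is relatively more
    abundant) in the pull-down than in the control. Proteins absent from
    the control are placed one rank below its last entry.
    """
    pulldown_rank = rank_by_count(pulldown_counts)
    control_rank = rank_by_count(control_counts)
    worst = len(control_counts) + 1
    return {
        name: control_rank.get(name, worst) - pulldown_rank[name]
        for name in pulldown_counts
    }
```

      For example, with the hypothetical counts `{"A": 10, "B": 8, "C": 2}` in the pull-down and `{"A": 9, "B": 1, "C": 5}` in the control, protein B moves from rank 3 to rank 2 and therefore has a gain of 1, independent of how much total material was loaded for either sample.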

      The authors quantify cell death in figures 2C - E. It seems clear that shSsb 1 and 2 have an effect on cell count even in the absence of Dox. The rescue effect seen upon Dox addition is minimal when compared to Empty + Dox (Figure 2D). The authors' ∆A-iXist line with and without Ssb KD/Dox would be an informative control on whether the increase in cell survival that they see is X-linked.

      As the reviewer pointed out earlier, Ssb plays multiple roles in cellular processes. Inevitably, KD of Ssb leads to slow growth and/or cell death with or without Dox. Thus, it is less meaningful to compare the surviving cell counts in Figure 2D. Rather, the survival rate (Figure 2E) reflects the rescuing effect more precisely. Shown in Figure 2E, both shSsb 1 and 2 increase the survival rate significantly, compared with Empty control.
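
      The distinction between surviving cell counts and survival rate can be made concrete with a small sketch. It assumes that the survival rate is the live-cell count with Dox divided by the live-cell count without Dox for the same knockdown condition; this definition is an assumption for illustration, and the function name and the numbers in the usage note are hypothetical, not values from Figure 2.

```python
def survival_rate(count_with_dox, count_without_dox):
    """Assumed definition: fraction of cells surviving Dox-induced Xist
    expression, normalized to the same condition without Dox."""
    if count_without_dox <= 0:
        raise ValueError("count_without_dox must be positive")
    return count_with_dox / count_without_dox


def compare_rescue(control, knockdown):
    """Return (control_rate, knockdown_rate) from (with_dox, without_dox) pairs."""
    return survival_rate(*control), survival_rate(*knockdown)
```

      With hypothetical counts of 100 (with Dox) and 1000 (without Dox) for the control and 120 and 400 for a knockdown, the absolute Dox-treated counts are similar, yet the survival rates are 0.1 and 0.3. Normalizing within each condition separates the general growth defect of the knockdown from its effect on Dox-induced cell death.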

      Moreover, the data in Figure 3B and C demonstrate that Ssb KD compromises the survival of female differentiating cells, but not the survival of male differentiating cells, also indicating a role of Ssb in XCI. These experiments should be sufficient to conclude that Ssb KD affects X-linked cell death/survival in both iXist male ESCs and WT female differentiating cells.

      The qPCR results used to validate silencing defects show minor changes in expression and also don't show significant silencing of X-linked genes sufficient for cell death. Could this be because only ~ 50 - 60% of Male iXist cells seem to be expressing in the movies and that this will have an effect on the observed qPCR results? Furthermore, it seems counterintuitive that expression in the Empty male cells increases in 48h compared to 14h. Is this due to cell death and positive selection of cells less able to silence their X-chromosome? How would these data look in the female XX line? How would the data look in a ∆A-iXist line in the presence and absence of shSsb/Dox?

      First, high-quality live-cell imaging can only be carried out for 2 hours at a 2-min time interval. The movies are meant to show the onset of Xist RNA signals. Therefore, they were taken one hour after Dox treatment (figure legend of Figure 4B-D). After overnight Dox treatment, Xist clouds can be seen in the majority of cells.

      Second, in Fig. 2F-G, we did not include uninduced iXist male ESCs. Therefore, it is impossible to judge whether induction of Xist in this male ESC line results in Xist-dependent silencing at 14 and 48 hr. However, in our previous publication (Li et al., JMB, 2018, 430: 2734-2746), it has been shown that Gpc4, Hprt, Mecp2, G418, and TomatoRed are silenced (4- to 16-fold reduction) at 24 and 48 hours after Dox induction.

      Third, the qRT-PCR results at 14 h and at 48 h are not normalized to the same internal control. Thus, they are not directly comparable.

      Confusingly, the male line in Fig 3C shows a drop in live cell count at day 6 of differentiation? Surely given their previous results in Fig 2 the Ssb KD should increase cell viability with +Dox? Ssb KD seems to have an adverse effect on ES cells during extended differentiation protocols. In Figure S1 the authors show ~ 8 - 10% survival of male lines during differentiation. Could the recombination of the Xist sequence around the loxP sites enable the cells to outcompete the dead cells? How would iEmpty and ∆A-iXist cells compare here? Have the differentiated cells been tested for their expression of Xist? Additionally, how are there similar live cell counts for male vs female lines when ~90% of male cells die during differentiation? Were more cells plated at day 4? If so, this would bias the competition of male cell survival and therefore make the male line an inappropriate control.

      Given the essential role of La during development a control is needed to prove that this death is X-linked in the female 3F1 line. For example, an XO cell line retaining the Cast allele and shSsb expression could show the amount of death caused from shSsb alone independent of X-linked cell death.

      The reviewer completely misunderstood the experiment. The severe cell death specifically observed in female differentiating ESCs is strong evidence that Ssb is involved in XCI (Figure 3).

      The male ESC line in Figure 3C is a WT line without the inducible Xist transgene, in which no XCI occurs upon differentiation. It is completely different from iXist male ESCs with Dox, in which forced Xist induction leads to XCI. Thus, the diverse functions of Ssb might contribute to the slight decrease in live cell count of wild-type male cells at day 6 of differentiation.

      Figure S2 shows the differentiation of iXist male ESCs with or without Dox. As explained above, forced Xist induction silences the only X chromosome in male cells, resulting in cell death. In addition, XCI occurs more efficiently under differentiation conditions (Figure S2) than in the pluripotent state (Figure 2C).

      During differentiation, female ESCs silence one X chromosome, and the other X chromosome remains active. KD of Ssb compromises XCI, so both X chromosomes remain active in some female differentiating cells, leading to cell death. The reviewer is correct that we need a control to rule out that the essential role of Ssb during development affects cell survival and death. An XO cell line can be used as a control. Similarly, a male cell line (XY) is also a good control. We already included a male cell line as a control in Figure 3B and 3C.

      If I understood correctly, the RNA FISH used dsDNA probes ("Sx9") against 40 kb of the X-inactivation centre (Xic). Surely Tsix or other Xic transcripts will also be visible? Can the authors use their RNA FISH to determine the XX or XO status of their cells? In Figure S5 a number of cells appear to show a single pinpoint of transcription. This could either be low levels of Xist transcripts or Xic transcription from an XO line in which the 129 chromosome is missing. It would be best to solely quantify cells which have two x chromosomes and if a significant amount of X chromosomes have been kicked out, this should be discussed and controlled for.

      This is a valid concern, but this concern can be adequately addressed with the available data in the manuscript.

      First, if the female Ssb KD cell line is an “XO” cell line, in which the X129 allele is “kicked out”, the RNA allelotyping results should show an absolute “silencing” of the X129 allele. However, in complete contrast to this notion, RNA allelotyping detected “more” RNA transcripts from X129, showing the chromosome-wide XCI defects (Figure 3D).

      Second, overexpression of Ssb in Ssb KD female cells restores the Xist clouds and the polycomb marks (Figure S8), suggesting that the Ssb KD female cells are XX, but not XO.

      Third, the severe cell death that occurred specifically in female Ssb KD lines also argues against the “XO” explanation (Figure 3B&C).

      In Fig6, the authors generated a number of Ssb constructs for a rescue assay. However, these results complicate the matter and raise more questions than they address. It seems odd that the ∆RRM1 does not rescue based on comparison with their putative negative control, ∆NLS. However, the ∆RRM1 + 2 and ∆LAM do rescue the phenotype better than the full length Ssb? This makes no logical sense and highlights the inherent variation in cell viability these generated cell lines seem to show.

      Following on from this, Figure S7 quantifies the GFP tag mRNA levels, depicting all ∆RRM mutants with expression below ~30%? How can ∆RRM1 or 2 be rescuing in this scenario? Have these lines been tested for their XX or XO status? The loss of an X chromosome would lead to a rescue of the cell death phenotype, which is a process known to occur in XX lines that have been cultured for extended periods of time. Could it also be that the cell lines derived are more or less sensitive to exogenous shRNA expression? Also, further validation is needed to assess the efficiency of KD in these lines as theoretically most of these constructs will be targeted by shRNA? What is the endogenous Ssb expression level in these lines? Where in the mRNA sequence are the shRNAs targeted to? Does this make sense on the relative expression levels of ∆RRM1/2 for example? Further testing of GFP expression could also be assessed by quantitative western blot of GFP or even visualised in their RNA FISH/IF samples (Figure S8), currently neither are shown. In addition, the stability of each Ssb protein construct has not been demonstrated.

      Our shRNA targets the LAM domain, so the expression of ∆LAM is not affected by the shRNA. The reviewer is correct that the detected GFP expression levels of ∆RRM1 and ∆RRM2 are too low to be conclusive. We have removed the data points for ∆RRM1 and ∆RRM2. Meanwhile, it is clear that ∆RRM1&2 has a better rescuing effect than ∆NLS when ∆RRM1&2 and ∆NLS are expressed at similar levels. Ssb is a well-known RNA chaperone/RNA helicase. Identifying Ssb as an Xist-binding protein already suggests a functional role of Ssb in XCI. The data from the plasmid rescue experiments further suggest that Ssb is involved in XCI as an RNA chaperone/RNA helicase.

      As for the Western blot and GFP fluorescence (IF), we have tried both. Neither detected a GFP signal, reflecting the low expression level of these GFP fusion proteins. As the reviewer pointed out, the shSsb does not target the 5’ or 3’ UTR and therefore also interferes with the exogenous Ssb. This might be a reason for the low expression of these GFP fusion proteins.

      For the data shown in Figure 7A and B the authors quantify the % of cells with Xist signal. The authors have already shown a defect in Xist visualisation in Ssb KD. Surely it is plausible to assume a faster loss of Xist signal below background in weaker expressing cells. A more appropriate quantification would be the % loss of Xist signal per cell over time.

      With Figure 7C and D, the samples have been treated with actinomycin D which globally affects the transcription of cells even the PolIII associated genes Ssb is needed to mature. This treatment could have an added effect on cell mortality and function. Data confirming that actinomycin D doesn't affect the cells disproportionately is needed. The difference in half-life could be attributed to such a treatment.

      We agree with the reviewer that monitoring Xist signal loss per cell would be a better way to analyze the data. However, in the Xist signal loss experiment, snapshot images were taken at four time points (1 h, 2 h, 3 h and 4 h). This is not time-lapse imaging. High-quality time-lapse imaging can only be done within a 2-hour period at a 2-min time interval. Therefore, cell tracking cannot be done in this experiment. In addition, even though Ssb KD slows down the formation of Xist clouds within the early phase (3 hours) of Xist induction (Figure 4), prolonged (overnight) Xist induction leads to Xist cloud formation in a significant fraction of Ssb KD cells, and the Xist cloud signals are about the same in WT and Ssb KD cells (Figure 7A, 0 h). Similarly, qRT-PCR also revealed that Xist RNA is at the same level in WT and Ssb KD cells (Figure 7C, 0 h). These data argue against the idea that the faster loss of Xist signal in Ssb KD cells is due to a weaker initial Xist signal.

      Actinomycin D was added for the last 11 hours of the experiment. During this period, no obvious adverse effects on cells were observed.

      In summarising the authors claim that La binds Xist to facilitate folding and appropriate spreading of Xist along the X-chromosome. No direct interaction has been shown, CLIP-seq data would resolve this, however I do understand this is a challenging technique. The authors have instead opted for RIP followed by qPCR (Figure S2). However, this process has a greater potential for non-specific recovery of RNAs via indirect binding. Furthermore, qPCR may also amplify the relative abundance of the RNA detected. As multiple nucleolar proteins came down in the mass spec screen and FLAG-Ssb is being over expressed, it is plausible to assume some transient Xist interactions may arise from nucleolar association at which La will be in high abundance. Positive and negative nuclear RNA controls (e.g. 7SK and U1 snRNA respectively) could be used so to determine the amount of non-specific Protein-RNA interactions in their RIP pull downs. Cytoplasmic actin is not an appropriate control as it is cytosolic.

      We have to clarify one point: the mass spec screen analyzed samples pulled down by FLAG-PUFb, not FLAG-Ssb.

      We did not intend to distinguish whether Ssb directly binds Xist or is just associated with Xist. RIP followed by qPCR is sufficient to prove the association between Ssb and Xist RNA.

      We can include nuclear RNAs as controls, if the reviewer regards RIP as a valid method to show protein-RNA association.

      Other than this the authors may want to probe (via IF) for the presence of La accumulation on the X? Many other known factors such as Ciz1, hnrnpK and PRC1/2 complexes show clear accumulation on the X. If I understand correctly, there are many La antibodies on the market and endogenous levels on the X could be assessed. These antibodies may be useful in IPs and pull downs also.

      Many XCI factors play extensive roles in the cell and are not clearly enriched on Xi, including Spen (Moindrot et al. 2015). We have tried immunostaining and did not detect enrichment of Ssb on Xi. Ssb shows a general distribution in the nucleus without a clear enrichment on Xi (data not shown).

      *-Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.*

      The experiments suggested above are centrally focussed on the cell lines that are currently in the authors' possession, with the possible exception of the suggested ∆A-iXist-shSsb line. However, this should be reasonably quick to obtain given their previous work for this paper. Most experiments suggested will focus on the validation of karyotype, Xist expression, rescue construct expression, further RNA FISH classification and repeating more appropriate positive and negative controls for a number of experiments. In theory this can be obtained relatively simply and quickly from current resources. But with the sheer volume of further experiments that are required here, this may take a significant amount of time.

      One vital improvement needed is the replication of mass spec data and the validation of Xist specific recovery and protein enrichment. As it stands this manuscript seems to not have any replicates of the FLAG-out methodology and mass spec data. This is troubling given the poor recovery and specificity of the protein samples obtained. Repeating these experiments would be costly in time and also financially. As it stands, I feel this is essential to conclusively validate their target of interest.

      *- Are the data and the methods presented in such a way that they can be reproduced?*

      The data is presented relatively well, however, it would be beneficial if detailed methods were in the main text and not in a supplementary file. Similarly, more information about the process of differentiation and how cell death/survival was quantified and validated is needed.

      The reviewer rejected the basic design of the experimental system and ignored the majority of the functional analysis data. No additional experiment can address these issues.

      We can include more information in the main text regarding Ssb. However, space in the main text is limited and varies by journal. Meanwhile, the current citations on Ssb are adequate to emphasize that Ssb is a versatile RNA-binding protein involved in a variety of fundamental RNA processing events in the cell.

      *- Are the experiments adequately replicated and statistical analysis adequate?*

      In the most part yes, however there seems to be no replicates of the FLAG-out mass spec screen which is worrying given the minimal specificity observed in the current data.

      As we mentioned above, the FLAG-out experiment only serves as a starting point to generate a list of Xist-binding protein candidates. Rather than repeating the FLAG-out experiment, we compared the result of FLAG-out to previously published lists of Xist-binding protein candidates. More importantly, additional experiments were carried out to validate the Xist-binding proteins identified by FLAG-out.

      **Minor comments:**

      *- Specific experimental issues that are easily addressable.*

      Unfortunately, the majority of experimental issues need to be addressed with more robust data which are highlighted above. However, some image analysis, quantification and classification can be amended relatively easily. For example, the live-cell imaging data should be quantified as loss of signal as discussed and RNA FISH should be used to classify XX positive cells and the XO cells can be discarded from analysis.

      We have addressed these issues in the previous sections of this rebuttal.

      *- Are prior studies referenced appropriately?*

      Most papers regarding Xist pull down and biology are discussed and referenced appropriately. However, the role that La plays during development and its aberrant effects upon KD are seemingly downplayed. I would like to see more discussion of potential defects that could be caused by globally altering cellular RNA folding.

      We have tried to cite key references about Ssb in development and RNA folding. Due to length limitations, we cannot cite all references on the topic. If necessary, we could discuss the possibility of an indirect effect of Ssb KD on XCI through globally altering cellular RNA folding.

      *- Are the text and figures clear and accurate?*

      For the most part, the figures are clear and accurate, apart from these exceptions.

      1. The Y-axis of Figure 2D is confusing. What does 0.3 as a "sum of area" equate to? 30% of the area was ES cells? This doesn't look to be the case from Fig 2C. Also, how does the intensity of the signal compare? The area may not be a good quantification due to ES cells growing in colonies.

      We have revised the Y-axis labelling of Figure 2D to “sum of area cm2”. Thus, “0.3” means that the area of ESCs is 0.3 cm². ALPP is highly expressed on the ES cell surface. ALPP staining usually produces saturated signal on ES cell colonies. Thoroughly stained ES cell colonies, big and small, show similar signal intensity levels. Analyzing the “total signal intensity” would therefore not differ much from analyzing the “sum of area”.

      2. In the Movies S1-7 there are boxes around certain cells and marked with "Figure 5a - c". This seems to be incorrect as figure 5 is currently the IF staining of polycomb marks. I assume this is in relation to Figure 4b-d?

      We have corrected the labelling mistakes.

      3. Similarly, in Movies S1-7, the intensities of Xist foci seem by eye to be similar. In the paper it is claimed that the Xist clouds that do form are lower in intensity. Are the Movies depicting the same range of pixel intensities? If not, this should be amended. Similarly, figure 7 seems to show relatively equivalent RNA signal at 0 h?

      All the images were collected using fixed microscope and camera settings, and these movies depict the same range of pixel intensities. Movies S1-S3 are WT controls, and Movies S4-S7 are Ssb KD cells. The Xist cloud signals are weaker in Movies S4-S7 (also quantified in Figure 4E). For the Xist cloud signal, not only the intensity but also the area of the Xist cloud has to be taken into account.

      The 0 h time point in Figure 7 is after overnight Dox treatment, and is different from the time points in Movies S1-7 (maximum 3 hours of Dox treatment, figure legend of Figure 4B-D). The discrepancy can be explained by the fact that knockdown of Ssb only slows down the formation of Xist clouds. After overnight forced expression, Xist RNA still accumulates in the cells. Figure 7 shows that the Xist RNA accumulated after prolonged Dox treatment disappears faster after Dox withdrawal.

      4. In figure 4A the data is from female XX cells, this should be highlighted to limit confusion with the male iXist data shown below in 4B-E. It would also be helpful to have the male/female icons (as in figure 3B), for each figure that has images of cells. Currently Figure 4, 5, 7, S5 and S8 are lacking these icons.

      We have revised the labelling on Figures 3, 4, 5, 7, S6 and S9 (S5 and S8 before revision).

      5. No explanation of the Flag-Ssb expression is given for figure S2. Furthermore, is it really necessary to express Flag-Ssb? There are reasonably good antibodies out there for Ssb as this was how it was originally found in Systemic Lupus patients. Also, no data showing the amount of Ssb being overexpressed is shown. This may have big implications for the validity of the RIP-qPCR analysis.

      We could perform qRT-PCR to quantify the overexpression level of Flag-Ssb. If required, we could use an Ssb antibody for Western blotting to show the amount of Flag-Ssb protein.

      *- Do you have suggestions that would help the authors improve the presentation of their data and conclusions?*

      Most of the data is presented reasonably well, but the lack of robustness of the data somewhat detracts from their conclusions. I feel the certainty of their conclusion regarding Xist specific La binding and RNA chaperone activity is still presumptive and should be rewritten unless more robust data can confirm Xist interaction. I would also suggest deciding on the nomenclature for the protein of interest and using either La or Ssb; the continued use of both through the figures and text can get a little confusing to the reader.

      In the current literature, Ssb seems to be commonly used as the gene name and La as the protein name. We have revised the manuscript to use one name, “Ssb”, to describe both the gene and the protein.

      Reviewer #1 (Significance (Required)):

      *- Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.*

      It was a good trial to use the PBSb-PUFb system to purify Xist RNA binding proteins, whereas previous reports had used antisense oligo purification with sequences complementary to Xist RNA. But currently the purification still needs further validation and repeats to confirm its use. A potential complementary technique could be to isolate Xist directly by using biotinylated probes against the PBSb sequence.

      The authors further claim the identification of a novel Xist RNA chaperone (La/Ssb) which they say facilitates XCI progression. This would be a novel finding in the field; however, the data is currently not robust enough to support this.

      *- Place the work in the context of the existing literature (provide references, where appropriate).*

      This work has focused on the development of a milder methodology for purifying Xist RNA during XCI. Others have published similar methodologies predominantly focusing on purifying Xist RNA directly with biotinylated probes (McHugh et al. 2015; Minajigi et al. 2015; and Chu et al. 2015). Although this method boasts a milder purification method, it seems to be low yielding in Xist specific proteins. Others have shown a more robust identification of bona fide Xist binding proteins which are currently missing in this manuscript. A recent preprint from the Plath lab has identified new factors involved in XCI during differentiation and their tethering/rescue experiments are far more convincing than the ones shown in this manuscript https://www.biorxiv.org/content/10.1101/2020.03.09.979369v1. The candidate protein Ha et al. have identified has multiple roles in developing cells and has been shown to be important during mouse development. However, Ha et al. do not robustly show that the knockdown of Ssb causes X-linked cell mortality. Alternatively, as would be presumed from Ssb's essential role in many housekeeping short non-coding RNAs, the cell death seems more ubiquitous upon shRNA KD. Therefore, the link the authors are making here is relatively weak.

      Ssb KD rescues cell death caused by forced induction of Xist in male ESCs. In addition, Ssb KD leads to cell death in differentiating female ESCs, while it has a negligible effect on cell death in differentiating male ESCs. These data clearly demonstrate that Ssb KD affects X-linked cell survival/mortality.

      The Plath lab’s work is different from ours. In their manuscript, the authors report the observation of a protein condensate that is assembled by Xist but is sustained in the absence of Xist. TDP-43 (a.k.a. Tardbp) happens to be one protein factor involved in the protein condensate and is also one candidate protein selected for further validation in our study. In our study, Tardbp KD did not rescue cell death caused by induced XCI in male cells. Thus, Tardbp was not studied further. In the manuscript, we have discussed the possibility that low efficiency of knockdown and redundancy might contribute to the failure to validate Tardbp.

      *- State what audience might be interested in and influenced by the reported findings.*

      The audience may be interested in the novel technique and the finding of a novel Xist binding protein.

      *- Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.*

      RNA biochemistry and developmental biology

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      This manuscript describes a novel "FLAG-out" system, where the authors sought to identify Xist RNA binding proteins. The authors focused on a specific protein found in their screen and also identified in several other screens for Xist RNA binding proteins, Ssb/La, and further characterize the role of this protein in XCI. This manuscript describes the loss of Ssb/La and suggest that it predominately impacts the canonical 'cloud' formation of Xist RNA on the X chromosome during XCI initiation. Further, they determine that loss of Ssb/La decreases Xist RNA half-life and alters folding of Xist RNA transcripts. Based on their findings, the authors propose that Ssb/La functions to directly bind and fold Xist RNA transcripts in a manner that stabilizes Xist RNA, allowing for proper 'cloud' formation and successful initiation of XCI.

      **Major comments:**

      The authors made an interesting finding that the SLE-relevant autoantigen Ssb/La stabilizes Xist RNA transcripts, and there is some evidence that this occurs by binding and maintaining proper folding of Xist RNA. Despite these intriguing observations, there are many parts of the manuscript that need to be addressed in order to support the authors' main conclusions.

      The most troubling aspect of this manuscript is the persistent use of an artificial XCI system in male cells to draw strong conclusions about the function of Ssb in XCI. This issue is prevalent throughout the manuscript, and I question why the authors chose to perform most of their experiments in male cells when the same experiments can be (and have previously been by other groups) performed in female cells. Using male ESCs and then making conclusions for XCI, which is a female-specific process, is a major concern.

      In addition to the iXist male ESC line, many experiments, such as cell death/survival (Figure 3B, C), allelotyping (Figure 3E), Xist cloud formation (Figure 4A), and H3K27me3 and H2AK119ub IF (Figure 5), were performed in female ESCs. We chose to do the SHAPE and Xist RNA stability assays in the iXist male ESC line because the onset of XCI is much more synchronized in this system. Moreover, in female cells, Xa causes additional layers of complication/noise in the ATAC-sequencing which may not be fully cleared up by data analysis. On the other hand, inducible Xist expression in male ESCs can be used as an experimental system to recapitulate the silencing step of XCI (Ha et al. 2018; Wutz et al. 2002).

      • Out of the 138 identified binding proteins, the authors chose to only validate three: Mybbp1a, Tardbp, and Ssb/La. The logic for choosing these candidates is weak, and the authors are only able to validate 1 out of 3 of these proteins.

      In theory, all candidate proteins in the list are possibly involved in XCI. No method is available that can make an accurate prediction. We did not follow a clear-cut logic in selecting candidates for validation, but we do consider the candidate gene’s knockout phenotype, “early embryonic lethality”, as a phenotype consistent with a critical role of the candidate gene in XCI. Meanwhile, in the manuscript, we have discussed why we chose the three proteins for validation as follows:

      “……From the candidate proteins, we shortlisted three proteins for individual validation. Myb-binding protein 1A (Mybbp1a, Q7TPV4) and TAR DNA-binding protein 43 (Tardbp, Q921F2) were selected because they are known transcription repressors (11, 12). The Lupus autoantigen La (P32067, encoding-gene name: Ssb) was selected because systemic lupus erythematosus (SLE) is an autoimmune disease characterized by a strikingly high female to male ratios of 9:1 (13). Moreover, its autoimmune antigen La is a ubiquitous and versatile RNA-binding protein and a known RNA chaperone (14). All the three selected candidates have also been identified as Xist-binding proteins in previous studies (2, 4). Moreover, the knockout of these three genes all lead to early embryonic death. Tardbp knockout causes embryonic lethality at the blastocyst implantation stage (15). Mybbp1a and Ssb knockout affect blastocyst formation (16, 17). Early embryonic lethality is a mutant phenotype consistent with a critical role of the mutated gene in XCI (1) ……”

      We used the cell death/survival assay to further validate the role of Xist-binding protein candidates in XCI. This is a stringent assay. It requires not only that the candidates bind to Xist, but also that they are functionally important in XCI.

      Indeed, it has been demonstrated by the Plath lab (the bioRxiv manuscript mentioned by reviewer 1) that Tardbp (also named TDP-43), together with other RBPs, binds to the E repeat of Xist to form a condensate and create an Xi-domain. Yet, Tardbp KD did not rescue cell death caused by forced XCI in male cells in our studies. Thus, only 1 out of 3 of these candidates was validated and further studied. In the manuscript, we also discussed that low efficiency of knockdown and redundancy might contribute to the failure to validate Tardbp and Mybbp1a.

      • Use of the cell death assay is not strong enough to "confirm that La is involved in induced XCI" as stated by the authors. This is a huge overstatement.

      Given the diverse functions of Ssb in cell differentiation and proliferation, ribosome biogenesis, transcriptional control and tRNA maturation, one would expect fewer surviving Ssb knockdown cells. In contrast, more Ssb knockdown cells survive in the presence of Dox, suggesting that Ssb plays an important role in XCI. Considering the reviewer’s comment, we revised the sentence to “further suggest that Ssb is involved in induced XCI”.

      While the authors observed differences in X-linked gene expression after Ssb KD, they did not examine expression of these genes after KD of either Mybbp1a or Tardbp. Are the changes observed in these genes specific to Ssb KD? Or could there still be alterations of X-linked gene expression in the non-validated KDs? This experiment should be performed and included in the manuscript, either within Fig 2 or in the supplemental. As well, a well characterized positive control, for example Hnrnpu, should be included as a comparison to Ssb.

      Mybbp1a and Tardbp were not validated by the cell death assay. Thus, compared with Ssb, Mybbp1a and Tardbp are functionally less important for XCI. We only focused on Ssb in the subsequent studies. Mybbp1a and Tardbp KD could be additional negative controls. Yet, we have already used the empty vector as a negative control, and we do not think further negative controls are needed.

      As mentioned, Tardbp indeed binds to Xist RNA. It is very likely that Tardbp KD might alter some X-linked gene expression. This rules out Tardbp KD as a good negative control.

      If we had not seen any effect of Ssb KD on X-linked gene expression, a positive control would be absolutely required. However, we have detected that Ssb KD compromises the silencing of several X-linked genes. A positive control might therefore not be essential.

      • The authors perform RIP to validate the interaction of Ssb with Xist, but this is performed in male ES cells with induced Xist RNA and with FLAG-tagged Ssb. Aside from these cells being male, in this system Xist RNA expression is much higher than would be found endogenously. RIP should have been done in female differentiated ESCs if there is in fact a role for XCI.

      • The authors need to include more details in the methods section to explain how the FLAG-Ssb is expressed in these cells, and why the authors chose to use a tagged construct over endogenous Ssb. Due to these issues the result from this experiment is essentially meaningless and is not convincing of Ssb interaction with Xist RNA. There is no reason RIP cannot be performed in female cells, and the authors should repeat this experiment in the relevant experimental condition. As well, if a validated Ssb antibody exists the authors should perform RIP using the endogenous protein.

      If required, we could try to perform RIP and/or CLIP using Ssb antibody in female cells.

      The authors state in Fig 3A-C that the results of the cell death and differentiation experiments "...support a functional role of La in XCI". The authors state earlier that Ssb is a ubiquitous protein that is embryonic lethal (in both females and males). Based on this, the cell death results shown do not support a functional role of La in XCI, as the Ssb KD could be having an indirect effect due to its other developmental functions. This manuscript lacks a direct functional link between Ssb and XCI; more data is necessary.

      Given the diverse functions of Ssb in cell differentiation and proliferation, ribosome biogenesis, transcriptional control and tRNA maturation, one would expect fewer surviving Ssb knockdown cells. In contrast, more Ssb knockdown cells survive in the presence of Dox, suggesting that Ssb plays an important role in XCI.

      For the data in Fig 3A-C, Ssb KD causes the death of female differentiating cells, but not male differentiating cells. Therefore, it rules out that the death of female cells is due to the general function of Ssb. Rather, the specific role of Ssb in XCI contributes to the female specific cell death.

      In Fig 3D, the authors perform ATAC-seq in inducible male ES cells. The authors claim that the extremely slight reduction in chromatin compaction of the Ssb KD compared to control iXist "directly connect La to the heterochromatinization of Xi, supporting a functional role of La in XCI". This is also an overstatement based on the minimal, and possibly indirect, change in compaction. The positive control i-deltaA-Xist sample has significantly less compaction (and thus a significantly greater compaction defect) than the Ssb KD, again disputing the claim stated above. It is unclear why performing ATAC-seq is even necessary, as Ssb isn't stated to have a function in regulating chromatin architecture. In addition, why the authors performed ATAC-seq in the artificial male XCI system rather than in the F1 female cells, and the N of the experiment, are unclear. If the authors want to include the ATAC-seq in further revisions it should be repeated n=3 in the female system.

      The male induced XCI system provides a more synchronized onset of XCI. More importantly, in the male induced XCI system, only one X chromosome exists, avoiding interference from the active X chromosome present in female cells. If ATAC-seq were performed in female cells, only loci with SNPs could be distinguished. The sequencing reads from Xa would create additional layers of complication/noise which may not be fully cleared up by data analysis.

      “i-deltaA-Xist” is a positive control to show that the experimental system works. It is not justified to compare the chromatin accessibility of the mutant, which is only an Ssb “knockdown” mutant, with that of the control “i-deltaA-Xist”, in which Repeat A is “deleted”. We admit that the ATAC-seq results did not reveal a drastic difference in chromatin accessibility between the wild type sample and the mutant sample. However, as we discussed in the manuscript, a clear difference can still be seen at the 14 h time point. This is shown clearly by the heatmap (Fig. 3E) and the sequencing coverage profile (Fig. S4A).

      • In Fig 6, the authors state in their methods that "The shRNA construct, which worked efficiently against Ssb, was not designed against the 3' UTR of the RNA. Therefore, the shRNA is against some of the rescue plasmid constructs. Nonetheless, transfecting the Ssb knockdown cells with the rescue plasmids should compensate the effect of Ssb knockdown and serve as a rescue assay to study the functional domains of La.". This is troubling and seems like a major experimental issue; the specific rescue constructs that may be impacted by this issue are not stated and should be explicitly mentioned. This becomes more confusing when examining the data from rescue experiments.

      We pointed out this issue in the original manuscript. We agree that the experiment was not perfectly designed. In the revision, we added information on the shRNA target site. Our shRNA targets the LAM domain, so the expression of ∆LAM is not affected by the shRNA. We agree that the detected GFP expression levels of ∆RRM1 and ∆RRM2 are too low to be conclusive. In the revision, we have removed the data points of ∆RRM1 and ∆RRM2. Meanwhile, it is clear that ∆RRM1&2 has a better rescuing effect than ∆NLS when ∆RRM1&2 and ∆NLS are expressed at similar levels. Ssb is a well-known RNA chaperone/RNA helicase. Identifying Ssb as an Xist-binding protein already suggests a functional role of Ssb in XCI. The data from the plasmid rescue experiments further suggest that Ssb is involved in XCI as an RNA chaperone/RNA helicase.

      If necessary, we could redo these experiments using a shSsb targeting the 3’-UTR or by expressing a GFP-Ssb immune to shSsb.

      In Figure S7, the expression of the rescue constructs deltaRRM1 and deltaRRM2 is extremely low, yet the authors observe a rescue of the cloud phenotype (fig 6D) from those constructs that reaches almost the level of full length Ssb. This is confusing, and the authors need to address this by performing a western blot to show the protein levels of these rescue constructs and discuss further how such a low level of expression can show a rescue phenotype. The results would also be stronger if the authors examined H3K27me3 and H2AK119ub1 enrichment since they observed decreased overlap of these marks with Xist RNA after Ssb KD. Finally, the authors state that "...all three RNA-binding domains are required for the functionality of La in XCI..." however I have trouble coming to this conclusion based on the above issues. As well, if the authors want to support direct function, they should repeat the RIP experiments with these rescue constructs to show that the domains capable of rescue can still bind to Xist RNA.

      Reviewer 1 raised similar concerns. In Figure 6C, the live cell counts of ∆RRM1 and ∆NLS are about the same. This might be due to the low expression level of ∆RRM1 (Figure S7). It is clear that ∆RRM1&2 has a better rescuing effect than ∆NLS when ∆RRM1&2 and ∆NLS are expressed at similar levels. To make the data more straightforward, we removed the data points of ∆RRM1 and ∆RRM2 because of their low expression levels.

      As for the Western blot and GFP fluorescence (IF), we have tried both. Neither of them detected a GFP signal, reflecting the low expression level of these GFP fusion proteins. The shSsb does not target the 5’- or 3’-UTR region and therefore interferes with the exogenous Ssb as well. This might be a reason for the low expression of these GFP fusion proteins. If necessary, we could redo these experiments using a shSsb targeting the 3’-UTR or by expressing a GFP-Ssb immune to shSsb.

      We deleted the sentence "all three RNA-binding domains are required for the functionality of La in XCI".

      **Minor comments:**

      The authors may want to consider better highlighting the strengths of their "FLAG-out" system. As written, it is difficult to tell how this system sets them apart from the previously published studies referenced in the text, especially as some of these studies used similar crosslinking conditions and cell types. Additionally, the logic and questions the authors pose in the introduction as to why they performed this project are too general and not very strong. For example, the authors mention how protein machinery may assemble on Xist RNA, and how Xist RNA may spread on the X chromosome. However, neither of these topics is actually addressed in their experiments or discussion. These are interesting questions, but the authors should either discuss them further within the context of their results or take these questions out. It would also be helpful if the authors could better label Figure 4, as it is unclear in the figure itself that Fig 4A is in reference to female cells, but the remaining panels are in male cells.

      Inducible XCI in male cells is a valid system to recapitulate the silencing step of XCI. It also provides unique advantages in many experiments, such as ATAC-seq. Meanwhile, we did perform extensive functional analysis of the endogenous XCI process using female cells. However, we do realize that presenting the data on induced XCI in male cells together with the data from female cells is confusing to many readers. We have revised the labelling of Figures 3, 4, 5, 7, S6 and S9 (S5 and S8 before revision).

      Understanding “how the protein machinery is assembled by Xist” and “how Xist spreads along its host chromosome territory” was not specifically an initial aim of this study. We removed these sentences from the introduction section. However, we believe Ssb may provide clues for future studies to fully address these questions, and we did provide the following thoughts in the discussion section:

      “……Secondly, as Ssb is able to utilize ATP to unwind RNA-RNA and RNA-DNA duplex, it may play a more active role in controlling the structural dynamics of Xist in living cells (14, 23). These structural dynamics may be important for recruiting proteins onto the RNA and spreading of the RNA along its host chromosome territory……”

      Reviewer #2 (Significance (Required)):

      I am not convinced that this manuscript, as written, has sufficient novelty. Ssb/La has been previously identified to be an Xist RNA binding protein with older/different approaches. However, there are some interesting observations in this manuscript. Major revisions are necessary.

      We agree with the reviewer that the identification of Ssb as an Xist RNA binding protein is not novel. The novelty of our discovery lies in the following: 1) we developed a new method for isolating lincRNA-associated proteins; 2) we confirmed that Ssb is an important player involved in XCI; 3) we showed that Ssb regulates the folding of Xist RNA, and consequently the stability of Xist and the formation of the Xist cloud.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript describes a novel "FLAG-out" system, with which the authors sought to identify Xist RNA binding proteins. The authors focused on a specific protein found in their screen and also identified in several other screens for Xist RNA binding proteins, Ssb/La, and further characterize the role of this protein in XCI. This manuscript describes the loss of Ssb/La and suggests that it predominantly impacts the canonical 'cloud' formation of Xist RNA on the X chromosome during XCI initiation. Further, they determine that loss of Ssb/La decreases Xist RNA half-life and alters folding of Xist RNA transcripts. Based on their findings, the authors propose that Ssb/La functions to directly bind and fold Xist RNA transcripts in a manner that stabilizes Xist RNA, allowing for proper 'cloud' formation and successful initiation of XCI.

      Major comments:

      The authors made the interesting finding that the SLE-relevant autoantigen Ssb/La stabilizes Xist RNA transcripts, and there is some evidence that this occurs by binding and maintaining proper folding of Xist RNA. Despite these intriguing observations, there are many parts of the manuscript that need to be addressed in order to support the authors' main conclusions.

      • The most troubling aspect of this manuscript is the persistent use of an artificial XCI system in male cells to draw strong conclusions about the function of Ssb in XCI. This issue is prevalent throughout the manuscript, and I question why the authors chose to perform most of their experiments in male cells when the same experiments can be (and have previously been by other groups) performed in female cells. Using male ESCs and then making conclusions for XCI, which is a female-specific process, is a major concern.

      • Out of the 138 identified binding proteins, the authors chose to only validate three: Mybbp1a, Tardbp, and Ssb/La. The logic for choosing these candidates is weak, and the authors are only able to validate 1 out of 3 of these proteins.

      • Use of the cell death assay is not strong enough to "confirm that La is involved in induced XCI" as stated by the authors. This is a huge overstatement.

      • While the authors observed differences in X-linked gene expression after Ssb KD, they did not examine expression of these genes after KD of either Mybbp1a or Tardbp. Are the changes observed in these genes specific to Ssb KD? Or could there still be alterations of X-linked gene expression in the non-validated KDs? This experiment should be performed and included in the manuscript, either within Fig 2 or in the supplemental. As well, a well characterized positive control, for example Hnrnpu, should be included as a comparison to Ssb.

      • The authors perform RIP to validate the interaction of Ssb with Xist, but this is performed in male ES cells with induced Xist RNA and with FLAG-tagged Ssb. Aside from these cells being male, in this system Xist RNA expression is much higher than would be found endogenously. RIP should have been done in female differentiated ESCs if there is in fact a role for XCI.

      • The authors need to include more details in the methods section to explain how the FLAG-Ssb is expressed in these cells, and why the authors chose to use a tagged construct over endogenous Ssb. Due to these issues the result from this experiment is essentially meaningless and is not convincing of Ssb interaction with Xist RNA. There is no reason RIP cannot be performed in female cells, and the authors should repeat this experiment in the relevant experimental condition. As well, if a validated Ssb antibody exists the authors should perform RIP using the endogenous protein.

      • The authors state in Fig 3A-C that the results of the cell death and differentiation experiments "...support a functional role of La in XCI". The authors state earlier that Ssb is a ubiquitous protein that is embryonic lethal (in both females and males). Based on this, the cell death results shown do not support a functional role of La in XCI, as the Ssb KD could be having an indirect effect due to its other developmental functions. This manuscript lacks a direct functional link between Ssb and XCI; more data is necessary.

      • In Fig 3D, the authors perform ATAC-seq in inducible male ES cells. The authors claim that the extremely slight reduction in chromatin compaction of the Ssb KD compared to control iXist "directly connect La to the heterochromatinization of Xi, supporting a functional role of La in XCI". This is also an overstatement based on the minimal, and possibly indirect, change in compaction. The positive control i-deltaA-Xist sample has significantly less compaction (and thus a significantly greater compaction defect) than the Ssb KD, again disputing the claim stated above. It is unclear why performing ATAC-seq is even necessary, as Ssb isn't stated to have a function in regulating chromatin architecture. In addition, why the authors performed ATAC-seq in the artificial male XCI system rather than in the F1 female cells, and the N of the experiment, are unclear. If the authors want to include the ATAC-seq in further revisions it should be repeated n=3 in the female system.

      • In Fig 6, the authors state in their methods that "The shRNA construct, which worked efficiently against Ssb, was not designed against the 3' UTR of the RNA. Therefore, the shRNA is against some of the rescue plasmid constructs. Nonetheless, transfecting the Ssb knockdown cells with the rescue plasmids should compensate the effect of Ssb knockdown and serve as a rescue assay to study the functional domains of La.". This is troubling and seems like a major experimental issue; the specific rescue constructs that may be impacted by this issue are not stated and should be explicitly mentioned. This becomes more confusing when examining the data from rescue experiments.

      • In Figure S7, the expression of the rescue constructs deltaRRM1 and deltaRRM2 is extremely low, yet the authors observe a rescue of the cloud phenotype (fig 6D) from those constructs that reaches almost the level of full length Ssb. This is confusing, and the authors need to address this by performing a western blot to show the protein levels of these rescue constructs and discuss further how such a low level of expression can show a rescue phenotype. The results would also be stronger if the authors examined H3K27me3 and H2AK119ub1 enrichment since they observed decreased overlap of these marks with Xist RNA after Ssb KD. Finally, the authors state that "...all three RNA-binding domains are required for the functionality of La in XCI..." however I have trouble coming to this conclusion based on the above issues. As well, if the authors want to support direct function, they should repeat the RIP experiments with these rescue constructs to show that the domains capable of rescue can still bind to Xist RNA.

      Minor comments:

      The authors may want to consider better highlighting the strengths of their "FLAG-out" system. As written, it is difficult to tell how this system sets them apart from the previously published studies referenced in the text, especially as some of these studies used similar crosslinking conditions and cell types. Additionally, the logic and questions the authors pose in the introduction as to why they performed this project are too general and not very strong. For example, the authors mention how protein machinery may assemble on Xist RNA, and how Xist RNA may spread on the X chromosome. However, neither of these topics is actually addressed in their experiments or discussion. These are interesting questions, but the authors should either discuss them further within the context of their results or take these questions out. It would also be helpful if the authors could better label Figure 4, as it is unclear in the figure itself that Fig 4A is in reference to female cells, but the remaining panels are in male cells.

      Significance

      I am not convinced that this manuscript, as written, has sufficient novelty. Ssb/La has been previously identified to be an Xist RNA binding protein with older/different approaches. However, there are some interesting observations in this manuscript. Major revisions are necessary.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Here Ha et al. have further developed their Pumilio RNA tagging methodology for the isolation of UV-crosslinked proteins that are suggested to associate with Xist RNA in mouse embryonic stem cells (mESCs). Within this study the authors claim to have found the Lupus antigen RNA binding protein (La) as a novel Xist interacting partner that influences the efficacy of X-chromosome inactivation (XCI). The authors use a number of different techniques such as qPCR, fluorescent imaging, ATAC-seq and SHAPE to show aberration of XCI upon La shRNA knockdown. However, this study has significant flaws in the efficient isolation and validation of Xist associated proteins using their FLAG-out methodology. Furthermore, later experiments predominantly focus on cell death/survival assays, which is somewhat troubling given the essential roles La plays in processes such as cell differentiation and proliferation, ribosome biogenesis, transcriptional control and tRNA maturation. I feel the authors need to robustly address the potential effects La knockdown may be having on their mESCs.

      Major comments:

      -Are the key conclusions convincing?

      My major concern is with their Xist RNA purification. First of all, I couldn't find any data proving the enrichment of Xist RNA itself in their Pumilio pull-down experiment. It would have been useful to show Xist RNA enrichment before the benzonase step. Secondly, it is hard to imagine that the protocol would successfully isolate Xist RNA-protein complexes from the cell. An earlier report by Clemson et al. (J Cell Biol., 1996) has shown that the majority of Xist RNA is still stuck in the nucleus after a nuclear matrix prep protocol using detergent, which is not so different from the authors' protocol. Moreover, the authors used UV crosslinking, which would have made it even harder to purify Xist RNA without sonication. Thirdly, as the tag is located at the 5' end of Xist RNA, it is rather surprising to see that Spen is not detected in their pulldown. Spen is one of the main functional interactors with Xist, robustly detected by several previous reports. Similarly, other high-affinity binders of Xist such as hnRNP-K and Ciz1 were also lacking from this screen. Finally, the peptide counts found associated with FLAG-out Xist are extremely low in comparison with other data using glutaraldehyde or formaldehyde crosslinking. For example, HnRNP-M found in Chu et al 2015 has 1120 peptide counts in differentiated cells. The authors here use HnRNP-M as a baseline for specific interactions and show a total of 6 peptide counts in Xist expressing cells and 5 in i-Empty cells (Supplementary excel sheet 1). Similarly, the La protein of interest in this study has 8 counts in i-FLAG-Xist and 6 counts in i-Empty. I struggle to see how this result indicates specific Xist binding. Worryingly, this is the starting rationale for the rest of their experiments; it is therefore hard to accept the rest of their conclusions either.
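      To make the scale of the peptide-count concern concrete, the counts quoted above can be turned into simple pull-down/control ratios. The following is only an illustrative sketch: the raw count ratio is a crude stand-in for enrichment and is not the analysis used in the manuscript, and the names `counts` and `enrichment` are hypothetical.

```python
# Peptide counts as quoted in the review (Supplementary excel sheet 1).
# "xist" = Xist-expressing pull-down, "empty" = i-Empty control.
counts = {
    "HnRNP-M": {"xist": 6, "empty": 5},
    "La": {"xist": 8, "empty": 6},
}

def enrichment(xist: int, empty: int) -> float:
    """Crude fold enrichment: pull-down counts divided by control counts.

    Assumption: raw counts are compared directly, with no normalisation
    for the amount of protein loaded or for replicate variation.
    """
    return xist / empty

ratios = {name: enrichment(c["xist"], c["empty"]) for name, c in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}-fold over i-Empty")
```

      Under these assumptions both proteins come out well below a two-fold difference over the control, and the protein used as the baseline (HnRNP-M) is within a few peptides of the protein of interest (La).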

      The other key conclusion the authors make is from the use of numerous cell death/survival assays for both male and female cell lines. This is extremely troubling in the context of assessing their target protein La. La is involved in multiple RNA maturation events of rRNAs, tRNAs and other polIII transcripts. Furthermore, La has been implicated in binding to the mRNA for Cyclin D1 in both human cells and mouse fibroblasts (NIH/3T3 - male), which show a significant effect on cell proliferation upon siRNA knockdown https://www.nature.com/articles/onc2010425. This, along with the observation that La knock-out blastocysts fail to develop any mice or ES cell lines (male or female), shows that the effect observed in the authors' results is most likely not X-linked cell death https://mcb.asm.org/content/mcb/26/4/1445.full.pdf. The authors need to show that their shRNA KD isn't affecting the proliferation and general fitness of their mESC lines.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      As discussed above, I feel the authors have not clearly demonstrated Xist specific protein enrichment and haven't proven X-linked cell death. Due to the lack of necessary control experiments as discussed below, I feel the notion that La is involved directly in XCI as an RNA chaperone is currently preliminary/speculative.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I would suggest that they show RT-qPCR results of Xist RNA enrichment from the sample after FLAG IP, before benzonase treatment.

      Also, it would have been more convincing if their negative control construct (i-Empty) contained at least 25 copies of PBSb RNA.

      In Fig1b, the total amount of proteins loaded on the gel is not equivalent between two lanes. The gel should show equivalent amounts of proteins on the gel. It looks like if the negative control sample had been loaded at the same amount as the one with Xist, the band pattern wouldn't be distinguishable between the two samples. Furthermore, as these samples were used in the following mass spectrometry screen it may suggest that the minimal increase in peptide counts observed in the iXist FLAG-out were due to an increased amount of sample being loaded? No controls are conducted to account for this.

      The authors quantify cell death in figures 2C - E. It seems clear that shSsb 1 and 2 have an effect on cell count even in the absence of Dox. The rescue effect seen upon Dox addition is minimal when compared to Empty + Dox (2D). The authors' ∆A-iXist line with and without Ssb KD/Dox would be an informative control on whether the increase in cell survival that they see is X-linked.

      The qPCR results used to validate silencing defects show minor changes in expression and also don't show significant silencing of X-linked genes sufficient for cell death. Could this be because only ~ 50 - 60% of Male iXist cells seem to be expressing in the movies and that this will have an effect on the observed qPCR results? Furthermore, it seems counterintuitive that expression in the Empty male cells increases in 48h compared to 14h. Is this due to cell death and positive selection of cells less able to silence their X-chromosome? How would these data look in the female XX line? How would the data look in a ∆A-iXist line in the presence and absence of shSsb/Dox?

      Confusingly, the male line in Fig 3C shows a drop in live cell count at day 6 of differentiation? Surely given their previous results in Fig 2 the Ssb KD should increase cell viability with +Dox? Ssb KD seems to have an adverse effect on ES cells during extended differentiation protocols. In Figure S1 the authors show ~ 8 - 10% survival of male lines during differentiation. Could the recombination of the Xist sequence around the loxP sites enable the cells to outcompete the dead cells? How would iEmpty and ∆A-iXist cells compare here? Have the differentiated cells been tested for their expression of Xist? Additionally, how are there similar live cell counts for male vs female lines when ~90% of male cells die during differentiation? Were more cells plated at day 4? If so, this would bias the competition of male cell survival and therefore make the male line an inappropriate control. Given the essential role of La during development a control is needed to prove that this death is X-linked in the female 3F1 line. For example, an XO cell line retaining the Cast allele and shSsb expression could show the amount of death caused from shSsb alone independent of X-linked cell death.

      If I understood correctly, the RNA FISH used dsDNA probes ("Sx9") against 40 kb of the X-inactivation centre (Xic). Surely Tsix or other Xic transcripts will also be visible? Can the authors use their RNA FISH to determine the XX or XO status of their cells? In Figure S5 a number of cells appear to show a single pinpoint of transcription. This could either be low levels of Xist transcripts or Xic transcription from an XO line in which the 129 chromosome is missing. It would be best to solely quantify cells which have two X chromosomes, and if a significant number of X chromosomes have been lost, this should be discussed and controlled for.

      In Fig 6, the authors generated a number of Ssb constructs for a rescue assay. However, these results complicate the matter and raise more questions than they address. It seems odd that the ∆RRM1 does not rescue based on comparison with their putative negative control, ∆NLS. However, the ∆RRM1 + 2 and ∆LAM do rescue the phenotype better than the full length Ssb? This makes no logical sense and highlights the inherent variation in cell viability these generated cell lines seem to show. Following on from this, figure S7 quantifies the GFP tag mRNA levels, depicting all ∆RRM mutants with expression below ~30%? How can ∆RRM1 or 2 be rescuing in this scenario? Have these lines been tested for their XX or XO status? The loss of an X chromosome would lead to a rescue of the cell death phenotype, which is a process known to occur in XX lines that have been cultured for extended periods of time. Could it also be that the cell lines derived are more or less sensitive to exogenous shRNA expression? Also, further validation is needed to assess the efficiency of KD in these lines, as theoretically most of these constructs will be targeted by shRNA? What is the endogenous Ssb expression level in these lines? Where in the mRNA sequence are the shRNAs targeted to? Does this make sense on the relative expression levels of ∆RRM1/2 for example? GFP expression could also be further assessed by quantitative western blot of GFP or even visualised in their RNA FISH/IF samples (Figure S8); currently neither is shown. In addition, the stability of each Ssb protein construct has not been demonstrated.

      For the data shown in Figure 7A and B the authors quantify the % of cells with Xist signal. The authors have already shown a defect in Xist visualisation in Ssb KD. Surely it is plausible to assume a faster loss of Xist signal below background in weaker expressing cells. A more appropriate quantification would be the % loss of Xist signal per cell over time.
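      The per-cell quantification suggested above could look something like the following sketch. It assumes that an Xist signal intensity has been measured for each tracked cell at each time point; the function name `percent_loss_per_cell` and all intensity values are hypothetical and are not taken from the manuscript.

```python
def percent_loss_per_cell(intensities):
    """Return, for each cell, the % loss of signal relative to its own t0.

    intensities: dict mapping a cell id to a list of signal intensities,
    ordered in time, with the first entry being the starting time point.
    Cells with no detectable signal at t0 are skipped.
    """
    result = {}
    for cell, series in intensities.items():
        baseline = series[0]
        if baseline <= 0:
            continue  # nothing to normalise against
        result[cell] = [100.0 * (baseline - value) / baseline for value in series]
    return result

# Hypothetical example: one strongly and one weakly expressing cell.
example = {
    "bright_cell": [200.0, 150.0, 100.0],
    "dim_cell": [80.0, 60.0, 40.0],
}
loss = percent_loss_per_cell(example)
print(loss)
```

      In this made-up example both cells lose the same fraction of their starting signal, even though the dim cell would drop below a fixed detection threshold sooner. That is the distinction a "% of cells with Xist signal" readout cannot make.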

      In Figure 7C and D, the samples have been treated with actinomycin D, which globally affects transcription in cells, including the PolIII-associated transcripts that Ssb is needed to mature. This treatment could have an added effect on cell mortality and function. Data confirming that actinomycin D doesn't affect the cells disproportionately is needed. The difference in half-life could be attributed to such a treatment.

      In summary, the authors claim that La binds Xist to facilitate folding and appropriate spreading of Xist along the X-chromosome. No direct interaction has been shown. CLIP-seq data would resolve this; however, I do understand this is a challenging technique. The authors have instead opted for RIP followed by qPCR (Figure S2). However, this process has a greater potential for non-specific recovery of RNAs via indirect binding. Furthermore, qPCR may also amplify the relative abundance of the RNA detected. As multiple nucleolar proteins came down in the mass spec screen and FLAG-Ssb is being overexpressed, it is plausible to assume some transient Xist interactions may arise from nucleolar association, where La will be in high abundance. Positive and negative nuclear RNA controls (e.g. 7SK and U1 snRNA respectively) could be used to determine the amount of non-specific protein-RNA interactions in their RIP pull downs. Cytoplasmic actin is not an appropriate control as it is cytosolic.

      Other than this, the authors may want to probe (via IF) for the presence of La accumulation on the X? Many other known factors such as Ciz1, hnrnpK and PRC1/2 complexes show clear accumulation on the X. If I understand correctly, there are many La antibodies on the market and endogenous levels on the X could be assessed. These antibodies may be useful in IPs and pull downs also.

      -Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The experiments suggested above are centrally focussed on the cell lines that are currently in the authors' possession, with the possible exception of the ∆A-iXist-shSsb line suggested. However, this should be reasonably quick to obtain given their previous work for this paper. Most experiments suggested will focus on the validation of karyotype, Xist expression, rescue construct expression, further RNA FISH classification and repeating more appropriate positive and negative controls for a number of experiments. In theory this can be obtained relatively simply and quickly from current resources. But with the sheer volume of further experiments that are required here, this may take a significant amount of time. One vital improvement needed is the replication of mass spec data and the validation of Xist specific recovery and protein enrichment. As it stands this manuscript seems to not have any replicates of the FLAG-out methodology and mass spec data. This is troubling given the poor recovery and specificity of the protein samples obtained. Repeating these experiments would be costly in time and also financially. As it stands, I feel this is essential to conclusively validate their target of interest.

      - Are the data and the methods presented in such a way that they can be reproduced?

      The data is presented relatively well; however, it would be beneficial if detailed methods were in the main text and not in a supplementary file. Similarly, more information about the process of differentiation and how cell death/survival was quantified and validated is needed.

      - Are the experiments adequately replicated and statistical analysis adequate?

      For the most part, yes; however, there seem to be no replicates of the FLAG-out mass spec screen, which is worrying given the minimal specificity observed in the current data.

      Minor comments:

      - Specific experimental issues that are easily addressable.

      Unfortunately, the majority of experimental issues need to be addressed with more robust data which are highlighted above. However, some image analysis, quantification and classification can be amended relatively easily. For example, the live-cell imaging data should be quantified as loss of signal as discussed and RNA FISH should be used to classify XX positive cells and the XO cells can be discarded from analysis.

      - Are prior studies referenced appropriately?

      Most papers regarding Xist pull down and biology are discussed and referenced appropriately. However, the role that La plays during development and its aberrant effects upon KD are seemingly downplayed. I would like to see more discussion of potential defects that could be caused by globally altering cellular RNA folding.

      - Are the text and figures clear and accurate?

      For the most part, the figures are clear and accurate, apart from the following exceptions.

      1. The Y-axis of Figure 2D is confusing. What does 0.3 as a "sum of area" equate to? 30% of the area was ES cells? This doesn't look to be the case from Fig 2C. Also, how does the intensity of the signal compare? The area may not be a good quantification due to ES cells growing in colonies.

      2. In Movies S1-7 there are boxes around certain cells, marked with "Figure 5a - c". This seems to be incorrect as figure 5 is currently the IF staining of polycomb marks. I assume this is in relation to Figure 4b-d?

      3. Similarly, in Movies S1-7, the intensities of Xist foci seem by eye to be similar. In the paper it is claimed that the Xist clouds that do form are lower in intensity. Are the Movies depicting the same range of pixel intensities? If not, this should be amended. Similarly, figure 7 seems to show relatively equivalent RNA signal at 0 h?

      4. In figure 4A the data is from female XX cells; this should be highlighted to limit confusion with the male iXist data shown below in 4B-E. It would also be helpful to have the male/female icons (as in figure 3B) for each figure that has images of cells. Currently Figures 4, 5, 7, S5 and S8 are lacking these icons.

      5. No explanation of the Flag-Ssb expression is given for figure S2. Furthermore, is it really necessary to express Flag-Ssb? There are reasonably good antibodies out there for Ssb as this was how it was originally found in Systemic Lupus patients. Also, no data showing the amount of Ssb being overexpressed is shown. This may have major implications for the validity of the RIP-qPCR analysis.

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Most of the data is presented reasonably well, but the lack of robustness in the data somewhat detracts from their conclusions. I feel the certainty of their conclusion regarding Xist specific La binding and RNA chaperone activity is still presumptive and should be rewritten unless more robust data can confirm Xist interaction. I would also suggest deciding on the nomenclature for the protein of interest and using either La or Ssb; the continued use of both through the figures and text can get a little confusing to the reader.

      Significance

      - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      It was a good idea to try the PBSb-PUFb system to purify Xist RNA-binding proteins, in contrast to previous reports that used antisense oligo purification with sequences complementary to Xist RNA. But currently the purification still needs further validation and repeats to confirm its use. A potential complementary technique could be to isolate Xist directly by using biotinylated probes against the PBSb sequence. The authors further claim the identification of a novel Xist RNA chaperone (La/Ssb) which they say facilitates XCI progression. This would be a novel finding in the field; however, the data is currently not robust enough to support this.

      - Place the work in the context of the existing literature (provide references, where appropriate).

      This work has focused on the development of a milder methodology for purifying Xist RNA during XCI. Others have published similar methodologies predominantly focusing on purifying Xist RNA directly with biotinylated probes (McHugh et al., Minajigi et al. and Chu et al.). Although this method boasts a milder purification, it seems to be low yielding in Xist specific proteins. Others have shown a more robust identification of bona fide Xist binding proteins, which are currently missing in this manuscript. A recent preprint from the Plath lab has identified new factors involved in XCI during differentiation, and their tethering/rescue experiments are far more convincing than the ones shown in this manuscript (https://www.biorxiv.org/content/10.1101/2020.03.09.979369v1). The candidate protein Ha et al have identified has multiple roles in developing cells and has been shown to be important during mouse development. However, Ha et al do not robustly show that the knockdown of Ssb causes X-linked cell mortality. Alternatively, as would be presumed from Ssb's essential role in many housekeeping short non-coding RNAs, the cell death seems more ubiquitous upon shRNA KD. Therefore, the link the authors are making here is relatively weak.

      - State what audience might be interested in and influenced by the reported findings.

      The audience may be interested in the novel technique and the finding of a novel Xist binding protein.

      - Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      RNA biochemistry and developmental biology

    1. Transparent Peer Review


      Download the complete Review Process [PDF] including:

      • reviews
      • authors' reply
      • editorial decisions


    2. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General comments

      We thank all three reviewers for their thoughtful and insightful comments on our manuscript. We appreciate that the reviewers recognized the significance and impact of our work: “Very little imaging has been done on CAR synapses and to our knowledge this is the first live cell imaging study describing CAR microclusters” (Reviewer 2); “This is an evolving field and little is known to date. Hence, this study could represent an insightful and important advance to the field” (Reviewer 3). A broad audience from both basic and clinical research will be interested in this work: “This study will have a broad audience. Both scientists that study basic T cell signaling as well as clinicians that use CAR Ts will be interested in this study” (Reviewer 2); “Audience is to both basic immunologist and cancer biologists” (Reviewer 3).

      Meanwhile, we understand that the reviewers have raised a few major and minor issues, which we have attempted to address. Most importantly, as suggested by both reviewers 1 and 3, we performed new experiments showing that LAT is not required for microcluster formation of the 1st generation CAR (new Fig 4 and EV5). This finding suggests that the LAT-independent signaling is due to the intrinsic CAR architecture, and is not dependent on the co-signaling domains of CD28 and 4-1BB.

      With the other issues also resolved, we believe the manuscript has been significantly improved and is ready for publication. Below we provide point-by-point responses to each reviewer’s comments.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors compare the TCR alone to a CAR that contains signaling modules from three receptors: TCR, CD28 and 41BB. The data quality is good, as are the experiments done. The difference is quite clear, and I would even like to see a little more of the evidence related to failure of the TCR system.

      We appreciate the generally positive comment of this reviewer.

      More specifically:

      Su and colleagues show that a third generation CAR with TCR zeta, CD28 and 41BB signal transduction pathways can activate a T cell for microcluster formation and Gads/SLP-76 recruitment, but not IL-2 production, without LAT. This is surprising because LAT is generally considered, as is upheld here, an essential adapter protein for T cell activation. However, this is not a "fair" experiment as the CAR has sequences from the TCR and two co-stimulatory receptors, CD28 and 41BB. It would be important and very straightforward to test first and second generation CARs to determine if LAT independence is a function of the CAR architecture itself, or of the additional costimulatory sequences. If it turns out that a first generation CAR with only TCR sequences can trigger LAT-independent clustering and SLP-76 recruitment, then the comparison would be fair and no additional experiment would be needed to make the point that the CAR architecture is intrinsically LAT independent. If the CD28 and/or 41BB sequences are needed for LAT independence, then the fair comparison would be to co-crosslink CD28 and 41BB with the TCR (41BB is an inducible costimulator, so anti-CD27 might be substituted to engage a constitutively expressed receptor with similar motifs), so that the two architectures are compared on equal terms.

      We agree with the reviewer that it is critical to make a “fair” comparison between TCR and CAR by testing the 1st generation CAR, which only contains the TCR/CD3z domain. Our new data showed that LAT is not required for microcluster and synapse formation of the 1st generation of CAR, in both Jurkat and primary T cells (new Fig 4 and EV5). This result is similar to our previously reported result from the 3rd generation CAR, although the 1st generation CAR induced less IL-2 production and CD69 expression in LAT null cells than the 3rd generation CAR did (new Fig 6). This suggests that the LAT-independent signaling is intrinsic to the CAR architecture, as the reviewer suggested. The co-signaling domains from CD28 and 4-1BB contribute to, but are not required for bypassing LAT to transduce the CAR signaling.

      The authors may want to cite work from Vignali and colleagues showing that even the TCR has two signaling modules: the classical ZAP-70/LAT module that is responsible for IL-2 and a Vav/Notch dependent module that controls proliferation. It's not clear to me that the issue raised about distinct signaling by CARs is completely parallel to this, but it's interesting that Vignali also associated the classical TCR signaling pathway as responsible for IL-2 with an alternative pathway that uses the same ITAMs to control distinct functions. See Guy CS, Vignali KM, Temirov J, Bettini ML, Overacre AE, Smeltzer M, Zhang H, Huppa JB, Tsai YH, Lobry C, Xie J, Dempsey PJ, Crawford HC, Aifantis I, Davis MM, Vignali DA. Distinct TCR signaling pathways drive proliferation and cytokine production in T cells. Nat Immunol. 2013;14(3):262-70.

      We appreciate the reviewer’s mentioning this paper from Vignali’s group. It provides insights into understanding LAT-independent signaling in CAR T cells. We cited this paper and added a discussion about the mechanism of LAT-independent signaling.

      I would be very interested to see a movie of the LAT deficient T cells interacting with the anti-CD3 coated bilayers in Figure 2A. Since OKT3 has a high affinity for CD3 and is coated on the surface at a density that should engage anti-CD3 I'm surprised there is no clustering even simply based on mass action. The result looks almost like a dominant negative effect of LAT deficiency on a high affinity extracellular interaction. It would be interesting to see how this interface evolves or if there is anti-adhesive behavior that emerges.

      We now present a movie showing the detailed process of LAT-deficient GFP-CAR T cells landing on the bilayers coated with OKT3 (new Movie EV5), in which the bright field images delineate the locations of the cells, the OKT3 signal marks TCR, and the GFP signal marks CAR proteins on the plasma membranes. No TCR clusters (as indicated by OKT3) were formed during the landing process. We think the binding of bilayer-presented OKT3 to TCR is not sufficient to trigger TCR microclusters. However, TCR microclusters could form in LAT-deficient cells if OKT3 is presented on a glass surface. This point is raised by reviewer 2. We added a discussion on the difference between bilayer- and glass-presented OKT3 in inducing microcluster formation.

      Reviewer #1 (Significance (Required)):

      While it is interesting that the CAR is LAT independent, it's obvious that the signalling networks are different as the CAR has two sets of motifs that are absent in the TCR, so the experiments as presented are not that insightful about the specific nature of the differences that lead to the different outcomes. At present it's not a particularly well controlled experiment as the third gen CAR is changing too many things in relation to the TCR for the experiment to be interpreted. It would be easy to address this in a revised manuscript. To publish as is, the discussion would need to acknowledge these limitations. The work is preliminary as science, but it might be useful to the T cell engineering field to have this information as a preliminary report, which might be an argument for adding discussion of limitations, but going forward without more detailed analysis of mechanism.

      This is an excellent point and we have addressed it. See our response above on the new data of the 1st generation CAR.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      In this study, the authors have interrogated CAR signaling by imaging CD19-CAR microclusters as well as T cell signaling molecules recruited to CAR microclusters. They report differences in spatial assembly between CAR and TCR microclusters that form on a lipid bilayer containing ligand. They also report that LAT is not required for CAR microcluster formation, recruitment of downstream signaling molecules or IL-2 production in Jurkat cells, while in primary T cells IL-2 production by CARs shows more of a LAT dependence. From these observations, they conclude that CAR T cells have a rewired signaling pathway as compared to T cells that signal through the TCR.

      Major comments:

      • Are the key conclusions convincing?

      The conclusions made by the authors about CAR microclusters are convincing. However, the conclusion that there is a "rewired signaling network" different from TCR microclusters needs to be more convincingly demonstrated in side-by-side comparisons of TCR and CAR microclusters and synapses.

      1. One of the key conclusions in this study is that CAR microclusters form in the absence of LAT, but TCR microclusters require LAT (in JCam2.5 cells in Fig. 2 and primary T cells in Fig. 4B). The requirement of LAT for formation of TCR microclusters is surprising, given multiple reports (one of which the authors have cited) that TCRz and ZAP70 clusters form normally in the absence of LAT (pZAP microclusters form normally in JCam2.5 cells Barda-Saad Nature Immunology 2005 Figure 1; TCRz clusters form normally in LAT CRISPR KO Jurkat cells Yi et al., Nature Communications, 2019 Figure 5). The authors should carefully evaluate TCRz and ZAP70 clusters (that form upstream of LAT) in their assays.

      We thank the reviewer for raising this excellent point. LAT-independent TCR clusters were reported in the two papers mentioned by the reviewer, which we think is convincing. However, there is a key difference in the experimental settings between these two papers and ours. We use supported lipid bilayer to present MOBILE TCR-activating antibody to activate T cells, whereas these two papers used IMMOBILE TCR-activating antibody attached to the cover glass. We reasoned that the mobile surface of supported lipid bilayer more closely mimics the antigen-presenting cell surface where antigens are mobile on the membrane. We added a new discussion about the difference between supported lipid bilayer and cover glass-based activation.

      We agree with the reviewer on the need for careful evaluation of TCR and ZAP70 clusters. We had shown the data of TCR clusters as marked by TCR-interacting OKT3 (Fig 3A). We performed new experiments on ZAP70 clusters (new Fig EV3). Our data suggest that, similar to TCR clusters, ZAP70 clusters are not formed in LAT-deficient T cells if activated by OKT3, but are formed if activated by CD19.

      1. The authors make major conclusions about LAT dependence and independence of TCR and CAR microclusters respectively, by using JCam2.5 Jurkat cells and CRISPR/Cas9 edited primary cells. Of relevance to this conclusion, differences in the phosphorylation status of ZAP70 and SLP76 have been described between JCam2.5 cells lacking LAT (in which LAT was found to be deleted by gamma radiation) and J.LAT cells (in which LAT was specifically deleted by CRISPR/Cas9 in Lo et al Nature Immunology 2018). Of importance, pZAP and pSLP76 appeared fairly intact in J.LAT cells, but absent in JCam2.5 cells (Lo et al., Nat Immunol. 2018, Supp Fig 2). Therefore, the authors should evaluate TCRz, ZAP70, Gads and SLP76 in TCR and CAR microclusters in J.LAT cells. This may partly explain the discrepancy in LAT requirement for IL-2 production in JCam2.5 cells and primary cells with LAT CRISPRed out.

      JCam2.5 is a classical, well-characterized LAT-deficient cell line that has been continuously used in the T cell signaling field (Barda-Saad Nature Immunology 2005, Rouquette-Jazdanian A, Mol. Cell, 2012; Balagopalan L, J Imm. 2013; Carpier J, J Exp Med, 2018; Zucchetti A, Nat. Comm. 2019). We agree with the concern that the reviewer raised on the absence of pZAP70 and pSLP76 in JCam2.5 cells. As the reviewer suggested, we obtained J.LAT, which is LAT null but has intact pZAP70 and pSLP76. We introduced CAR into J.LAT and the wild-type control and performed the clustering assay as we did for JCam2.5. Our results showed that, similar to JCam2.5, CAR forms robust microclusters in J.LAT cells (new Fig EV2). More importantly, we presented data confirming the LAT-independent CAR clustering, SLP76 phosphorylation, and IL-2 production in human primary T cells (Fig 7). Therefore, the data from three independent cell sources support our conclusion on LAT-independent CAR signal transduction.

      1. Since the authors are reporting differences between CAR synapses and TCR synapses, the authors should show side by side comparison of CAR and TCR synapses in Figure 1F.

      We focused on characterizing the CAR synapse in this manuscript and did not draw any conclusions on the difference between TCR and CAR synapses. We are cautious about comparing the CAR synapse to the TCR synapse for technical reasons: it is critical to use antigen-specific TCRs (e.g. mouse OTI as a common model) to study the TCR synapse pattern so that the study will be physiologically relevant. However, we use a human T cell line and human primary T cells for the CAR study. The technical barrier to introducing an antigen-specific TCR complex into these cells, and to activating these cells with purified peptide-MHC complex, is very high. The result would be interesting, but is beyond the scope of the current work.

      1. The authors should evaluate Gads microcluster formation in response to TCR stimulation via OKT3 (in Figure 4A). Given that it has been reported that TCRz, Grb2 and c-Cbl are recruited to microclusters in Jurkat cells lacking LAT by CRISPR deletion (Yi et al., Nature Communications, 2019), it is important to establish the differences between TCR microclusters and CAR microclusters in side by side comparisons in their assay system.

      As the reviewer suggested, we evaluated Gads microcluster formation with TCR stimulation and found that Gads did not form microclusters in LAT-deficient cells (new Fig 5A). Because we only drew conclusions on the Gads-SLP76 pathway, we think investigating Grb2 and c-Cbl microclusters, though interesting, is beyond the scope of this manuscript.

      1. Similar to the comment about Gads above, the authors should evaluate pSLP76 microcluster formation in response to TCR stimulation via OKT3 in primary T cells lacking LAT in Figure 4C, i.e. side by side comparisons of pSLP76 in TCR and CAR synapses (with and without LAT) should be shown.

      We fully agree and performed a new experiment on pSLP76 in human primary T cells. Our data suggest that, similar to Jurkat cells, pSLP76 microclusters remain intact in LAT null primary cells (new Fig 7D and 7E).

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?
      1. The data shown in Figure 3C shows a reduction in conjugate formation from 80% (WT) to 30% (LAT -). This is a severe reduction and does not support the authors' claim in the corresponding Figure legend that "LAT is dispensable for cell conjugate formation between Jurkat T cells expressing CAR and Raji B cells" and the Abstract that "LAT.....is not required for....immunological synapse formation". Statistical analysis for variance should be shown here.

      We agree with the reviewer’s judgement. This cell conjugation analysis was performed using JCam2.5 cells. As pointed out by the reviewer, JCam2.5 has defects in ZAP70 and SLP76 in addition to the lack of LAT. Therefore, we performed the same analysis again using J.LAT cells, as recommended by the reviewer. Our new data showed that J.LAT cells form conjugates with Raji B cells at a similar rate to the wild-type cells, as evaluated by statistical analysis (new Fig 6A). Therefore, we think these new data support the claim that LAT is dispensable for cell conjugate formation.

      1. In a similar vein, based on data from Movie S5 (where in a single cell, CAR microclusters translocate from cell periphery to center), and Figure 3C where (as described above in point 1) conjugate formation appears to be severely reduced, the authors conclude in the Results and Abstract that "LAT....is not required for actin remodeling following CAR activation". This conclusion is not supported by the data and the authors should remove this claim. Alternatively, actin polymerization in CAR expressing cells (that are LAT sufficient and deficient) can be easily evaluated using phalloidin or F-Tractin.

      As suggested by the reviewer, we evaluated actin polymerization in TCR or CAR stimulated cells using a filamentous actin reporter F-tractin. Our data showed that LAT is required for TCR-induced but not CAR-induced actin polymerization (new Fig 5C). Therefore, our results support the claim that LAT is not required for actin remodeling following CAR activation.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Yes. Please see major comments above.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes. It should take 3 months to complete these experiments, since reagents and experimental systems to do these experiments already exist.

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes. Methods are clearly explained.

      We appreciate the reviewer’s recognition of the clarity of the methods part.

      • Are the experiments adequately replicated and statistical analysis adequate?

      There is no statistical analysis to evaluate differences between samples in Figures 3 and 4. These must be included.

      We have now added statistical analysis in Fig 5B and 6A (old figure 3 and 4).

      Minor comments:

      • Specific experimental issues that are easily addressable.

      Please see Major Comments above. We believe that the recommended experiments are not difficult to execute since reagents exist and experimental systems are already set up.

      • Are prior studies referenced appropriately?

      Authors reference 13 and 14 for the following sentence in Results section 2: "Deletion or mutation of LAT impairs formation of T cell microclusters". However, in Reference 14 Barda-Saad et al., actually show that pZAP clusters are intact in JCam2.5 cells lacking LAT. Perhaps authors should clarify that LAT (and downstream signaling molecule) microclusters are impaired when LAT is deleted or mutated.

      As the reviewer suggested, we have now clarified that clustering of LAT downstream binding partners is impaired when citing this reference (Barda-Saad et al).

      • Are the text and figures clear and accurate?

      Yes. But would be helpful if authors specify what "control" is in Fig. 3B and C. In Figure 3B it is lipid bilayers without CD19, while in 3C it is K562 cells that do not express CD19.

      We have now specified “control” in the figure.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Would be helpful if authors specify in every Figure or at least Figure legend the experimental bilayer system/ligand used, since they use both OKT3 and CD19 as ligands in the paper.

      We have now specified the ligand in the figure or legend.

      Reviewer #2 (Significance):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      If CAR microclusters and synapses are appropriately compared in a side by side comparison with TCR microclusters and synapses (as described in comments above), this study will be a conceptual advance in the field of CAR signaling. CAR microclusters have not been studied previously.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      Very little imaging has been done on CAR synapses and to our knowledge this is the first live cell imaging study describing CAR microclusters.

      We appreciate this reviewer’s comment on our work as a conceptual advance in understanding CAR signaling.

      • State what audience might be interested in and influenced by the reported findings.

      This study will have a broad audience. Both scientists that study basic T cell signaling as well as clinicians that use CAR Ts will be interested in this study.

      We appreciate this reviewer’s recognition of the broad audience of this manuscript.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      T cell signaling and imaging of proximal T cell signaling responses.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This manuscript by Dong and colleagues characterizes the molecular requirements and consequences of engaging a third-generation chimeric antigen receptor (CAR) directed to CD19. Utilizing a biological system of JCaM2.5, a Jurkat T cell mutant with dramatically low levels of LAT, expressing a CAR directed to CD19 fused to the cytoplasmic tails of CD28, 4-1BB and CD3z that is activated by CD19/ICAM1 reconstituted lipid bilayers, the authors demonstrate LAT is not required for microcluster formation, immunologic synapse formation or recruitment of GADS and pSLP76 to the plasma membrane. In contrast, LAT was required for anti-CD3 mediated microcluster formation and pSLP76 recruitment to the plasma membrane. However, LAT does appear to contribute to efficient synapse formation, PIP2 hydrolysis and IL-2 secretion when CAR+ JCaM2.5 or primary T cells are presented with Raji B cells, respectively. These data provide intriguing insights into the molecular requirements for third-generation CAR-T cell functions. The authors have developed quite a nice system to understand the molecular contributions for CAR-T function. A few suggestions are provided here to further enhance the accuracy and significance of the findings:

      1. The authors can address whether the LAT-independent effects are due to the attributes of third generation CAR-Ts with inclusion of CD28 and 4-1BB cytoplasmic domains or whether these differences are intrinsic to all CAR-Ts (e.g., first and second generation CARs).

      This is an excellent point. We have included new data showing LAT-independent cluster formation of the 1st generation CAR in both Jurkat and primary T cells (new Fig 4 and EV5). Therefore, we favor the second possibility pointed out by the reviewer, that LAT-independent effects are intrinsic to the CAR architecture.

      1. Since a first-generation CAR-T forms non-conventional synapses (Davenport, et al., PNAS 2018), the authors should consider more detailed kinetic analysis to understand the formation and dissolution of the constituents of the synapse with their third generation CAR. This should include measurements of the duration of microcluster and synapse formation as well as further analysis of c- and p-SMAC constituents (e.g., LFA-1, TALIN, LCK and pSLP76) over time.

      We agree with the reviewer on the value of a more detailed characterization of the CAR synapse. We measured the duration of the unstable CAR synapse and the time from cell landing to the start of retrograde flow (new Fig 2C). We also determined the localization of CD45, a marker for the d-SMAC (new Fig 2D). We found that the formation of the d-SMAC is also not common in the CAR T synapse, strengthening our conclusion that CAR forms a non-typical immunological synapse.

      3. The authors utilize two different activation platforms. While using CD19/ICAM1 reconstituted bilayers, CAR+ JCaM2.5 or CAR+ primary T cells demonstrate no differences compared to wildtype JCaM2.5 cells in the parameters studied. However, when using Raji B cells, the CAR+ JCaM2.5 cells or CAR+ primary T cells demonstrate a more intermediate phenotype with respect to cell conjugate formation (Figure 3C) and IL-2 production (Figure 4D). The authors should analyze whether the differences attributed to the different outcomes may be due to the stimulation mode. For example, is c-SMAC assembly and GADS or pSLP76 recruitment to the plasma membrane still LAT-independent when activated with Raji B cells?

      As the reviewer suggested, we examined c-SMAC assembly in Raji B cells conjugated with CAR T cells. We found that the majority of CARs do not form a c-SMAC (new Fig EV4), which is consistent with the result from the bilayer activation system. Since both Gads and SLP76 are cytosolic proteins, they remain largely in the cytosolic pool, which obscures their recruitment and clustering on the plasma membrane when imaged by confocal microscopy at the cross-section of the cell-cell synapse.

      4. The authors should consider whether CAR expression level affects their observations. For example, do lower levels of CAR expression make the system LAT-dependent? Further, what is the level of the CAR relative to endogenous TCR expression on their primary T cells.

      We agree with the reviewer that it is informative to determine if LAT-independent signaling is dose dependent. We tried to measure the CAR concentration relative to the endogenous TCR/CD3z. By western blot using two different antibodies against CD3z, we detected TCR/CD3z expression, but found no bands corresponding to CAR. We believe this reflects a low expression of CAR in our system, which is confirmed by FACS. The generally low expression of CAR makes it challenging to sort an even lower CAR-expressing population. Therefore, we sought an alternative way to determine the dose dependence: we titrated the CD19 concentration on the bilayer. As shown in the new Figure EV1, CAR formed microclusters similarly in wild-type and LAT-deficient cells across a wide range of CD19 concentrations. Therefore, we conclude that the LAT-independent cluster formation is robust at low antigen density as well.

      Minor comment:

      1. Since JCaM2.5 has differences when compared to the parental Jurkat E6.1 T cell line, the authors should utilize JCaM2.5 reconstituted with wildtype LAT as a comparator.

      We agree with the reviewer and recognize that JCaM2.5 was generated by mutagenesis, which may alter the expression of genes besides LAT. As suggested by Reviewer 1, we used J.LAT, a genuine LAT knockout cell line generated by CRISPR-mediated gene targeting, to perform the clustering assay (new Fig EV2). Our results showed that, as in JCaM2.5, the CAR but not the TCR formed microclusters in J.LAT cells.

      Reviewer #3 (Significance):

      The mechanism(s) by which CAR-Ts function is of high significance from both scientific and clinical viewpoints. From a scientific viewpoint, it provides important basic mechanistic information of how T cells are being activated to kill tumor cells. By understanding the molecular requirements, additional generations of CARs can be designed to provide greater efficacy, overcome resistance and possibly reduce toxicity.

      This is an evolving field and little is known to date. Hence, this study could represent an insightful and important advance to the field.

      The audience includes both basic immunologists and cancer biologists.

      We appreciate this reviewer’s comments on the high significance of our work to the field of both basic immunology and clinical application.

      My expertise is in T cell signaling, T cell biology and immunotherapy.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript by Dong and colleagues characterizes the molecular requirements and consequences of engaging a third-generation chimeric antigen receptor (CAR) directed to CD19. Utilizing a biological system of JCaM2.5, a Jurkat T cell mutant with dramatically low levels of LAT, expressing a CAR directed to CD19 fused to the cytoplasmic tails of CD28, 4-1BB and CD3z that is activated by CD19/ICAM1 reconstituted lipid bilayers, the authors demonstrate LAT is not required for microcluster formation, immunologic synapse formation or recruitment of GADS and pSLP76 to the plasma membrane. In contrast, LAT was required for anti-CD3 mediated microcluster formation and pSLP76 recruitment to the plasma membrane. However, LAT does appear to contribute to efficient synapse formation, PIP2 hydrolysis and IL-2 secretion when CAR+ JCaM2.5 or primary T cells are presented with Raji B cells, respectively. These data provide intriguing insights into the molecular requirements for third-generation CAR-T cell functions.

      The authors have developed quite a nice system to understand the molecular contributions for CAR-T function. A few suggestions are provided here to further enhance the accuracy and significance of the findings:

      1. The authors can address whether the LAT-independent effects are due to the attributes of third generation CAR-Ts with inclusion of CD28 and 4-1BB cytoplasmic domains or whether these differences are intrinsic to all CAR-Ts (e.g., first and second generation CARs).
      2. Since a first-generation CAR-T forms non-conventional synapses (Davenport, et al., PNAS 2018), the authors should consider more detailed kinetic analysis to understand the formation and dissolution of the constituents of the synapse with their third generation CAR. This should include measurements of the duration of microcluster and synapse formation as well as further analysis of c- and p-SMAC constituents (e.g., LFA-1, TALIN, LCK and pSLP76) over time.
      3. The authors utilize two different activation platforms. While using CD19/ICAM1 reconstituted bilayers, CAR+ JCaM2.5 or CAR+ primary T cells demonstrate no differences compared to wildtype JCaM2.5 cells in the parameters studied. However, when using Raji B cells, the CAR+ JCaM2.5 cells or CAR+ primary T cells demonstrate a more intermediate phenotype with respect to cell conjugate formation (Figure 3C) and IL-2 production (Figure 4D). The authors should analyze whether the differences attributed to the different outcomes may be due to the stimulation mode. For example, is c-SMAC assembly and GADS or pSLP76 recruitment to the plasma membrane still LAT-independent when activated with Raji B cells?
      4. The authors should consider whether CAR expression level affects their observations. For example, do lower levels of CAR expression make the system LAT-dependent? Further, what is the level of the CAR relative to endogenous TCR expression on their primary T cells.

      Minor comment:

      1. Since JCaM2.5 has differences when compared to the parental Jurkat E6.1 T cell line, the authors should utilize JCaM2.5 reconstituted with wildtype LAT as a comparator.

      Significance (Required)

      The mechanism(s) by which CAR-Ts function is of high significance from both scientific and clinical viewpoints. From a scientific viewpoint, it provides important basic mechanistic information of how T cells are being activated to kill tumor cells. By understanding the molecular requirements, additional generations of CARs can be designed to provide greater efficacy, overcome resistance and possibly reduce toxicity.

      This is an evolving field and little is known to date. Hence, this study could represent an insightful and important advance to the field.

      The audience includes both basic immunologists and cancer biologists.

      My expertise is in T cell signaling, T cell biology and immunotherapy.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      In this study, the authors have interrogated CAR signaling by imaging CD19-CAR microclusters as well as T cell signaling molecules recruited to CAR microclusters. They report differences in spatial assembly between CAR and TCR microclusters that form on a lipid bilayer containing ligand. They also report that LAT is not required for CAR microcluster formation, recruitment of downstream signaling molecules or IL-2 production in Jurkat cells, while in primary T cells IL-2 production by CARs shows more of a LAT dependence. From these observations, they conclude that CAR T cells have a rewired signaling pathway as compared to T cells that signal through the TCR.

      Major comments:

      Are the key conclusions convincing?

      The conclusions made by the authors about CAR microclusters are convincing. However, the conclusion that there is a "rewired signaling network" different from TCR microclusters needs to be more convincingly demonstrated in side-by-side comparisons of TCR and CAR microclusters and synapses.

      1. One of the key conclusions in this study is that CAR microclusters form in the absence of LAT, but TCR microclusters require LAT (in JCam2.5 cells in Fig. 2 and primary T cells in Fig. 4B). The requirement of LAT for formation of TCR microclusters is surprising, given multiple reports (one of which the authors have cited) that TCR and ZAP70 clusters form normally in the absence of LAT (pZAP microclusters form normally in JCam2.5 cells Barda-Saad Nature Immunology 2005 Figure 1; TCR clusters form normally in LAT CRISPR KO Jurkat cells Yi et al., Nature Communications, 2019 Figure 5). The authors should carefully evaluate TCR and ZAP70 clusters (that form upstream of LAT) in their assays.
      2. The authors make major conclusions about LAT dependence and independence of TCR and CAR microclusters respectively, by using JCam2.5 Jurkat cells and CRISPR/Cas9 edited primary cells. Of relevance to this conclusion, differences in the phosphorylation status of ZAP70 and SLP76 have been described between JCam2.5 cells lacking LAT (in which LAT was found to be deleted by gamma radiation) and J.LAT cells (in which LAT was specifically deleted by CRISPR/Cas9 in Lo et al Nature Immunology 2018). Of importance, pZAP and pSLP76 appeared fairly intact in J.LAT cells, but absent in JCam2.5 cells (Lo et al., Nat Immunol. 2018, Supp Fig 2). Therefore, the authors should evaluate TCR, ZAP70, Gads and SLP76 in TCR and CAR microclusters in J.LAT cells. This may partly explain the discrepancy in LAT requirement for IL-2 production in JCam2.5 cells and primary cells with LAT CRISPRed out.
      3. Since the authors are reporting differences between CAR synapses and TCR synapses, the authors should show side by side comparison of CAR and TCR synapses in Figure 1F.
      4. The authors should evaluate Gads microcluster formation in response to TCR stimulation via OKT3 (in Figure 4A). Given that it has been reported that TCR, Grb2 and c-Cbl are recruited to microclusters in Jurkat cells lacking LAT by CRISPR deletion (Yi et al., Nature Communications, 2019), it is important to establish the differences between TCR microclusters and CAR microclusters in side by side comparisons in their assay system.
      5. Similar to the comment about Gads above, the authors should evaluate pSLP76 microcluster formation in response to TCR stimulation via OKT3 in primary T cells lacking LAT in Figure 4C, i.e. side by side comparisons of pSLP76 in TCR and CAR synapses (with and without LAT) should be shown.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      1. The data shown in Figure 3C shows a reduction in conjugate formation from 80% (WT) to 30% (LAT -). This is a severe reduction and does not support the authors' claim in the corresponding Figure legend that "LAT is dispensable for cell conjugate formation between Jurkat T cells expressing CAR and Raji B cells" and the Abstract that "LAT.....is not required for....immunological synapse formation". Statistical analysis for variance should be shown here.
      2. In a similar vein, based on data from Movie S5 (where in a single cell, CAR microclusters translocate from cell periphery to center), and Figure 3C where (as described above in point 1) conjugate formation appears to be severely reduced, the authors conclude in the Results and Abstract that "LAT....is not required for actin remodeling following CAR activation". This conclusion is not supported by the data and the authors should remove this claim. Alternatively, actin polymerization in CAR expressing cells (that are LAT sufficient and deficient) can be easily evaluated using phalloidin or F-Tractin.

      Would additional experiments be essential to support the claims of the paper?
      Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Yes. Please see major comments above.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes. It should take 3 months to complete these experiments, since reagents and experimental systems to do these experiments already exist.

      Are the data and the methods presented in such a way that they can be reproduced?

      Yes. Methods are clearly explained.

      Are the experiments adequately replicated and statistical analysis adequate?

      There is no statistical analysis to evaluate differences between samples in Figures 3 and 4. These must be included.

      Minor comments:

      Specific experimental issues that are easily addressable.

      Please see Major Comments above. We believe that the recommended experiments are not difficult to execute since reagents exist and experimental systems are already set up.

      Are prior studies referenced appropriately?

      The authors cite References 13 and 14 for the following sentence in Results section 2: "Deletion or mutation of LAT impairs formation of T cell microclusters". However, in Reference 14, Barda-Saad et al. actually show that pZAP clusters are intact in JCam2.5 cells lacking LAT. Perhaps the authors should clarify that LAT (and downstream signaling molecule) microclusters are impaired when LAT is deleted or mutated.

      Are the text and figures clear and accurate?

      Yes. But would be helpful if authors specify what "control" is in Fig. 3B and C. In Figure 3B it is lipid bilayers without CD19, while in 3C it is K562 cells that do not express CD19.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?
      It would be helpful if the authors specified, in every Figure or at least in every Figure legend, the experimental bilayer system/ligand used, since they use both OKT3 and CD19 as ligands in the paper.

      Significance (Required)

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      If CAR microclusters and synapses are appropriately compared in a side by side comparison with TCR microclusters and synapses (as described in comments above), this study will be a conceptual advance in the field of CAR signaling. CAR microclusters have not been studied previously.

      Place the work in the context of the existing literature (provide references, where appropriate).

      Very little imaging has been done on CAR synapses and to our knowledge this is the first live cell imaging study describing CAR microclusters.

      State what audience might be interested in and influenced by the reported findings.

      This study will have a broad audience. Both scientists that study basic T cell signaling as well as clinicians that use CAR Ts will be interested in this study.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      T cell signaling and imaging of proximal T cell signaling responses.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The authors compare the TCR alone to a CAR that contains signaling modules from three receptors: TCR, CD28 and 41BB. The data quality is good and the experiments are well done. The difference is quite clear, and I would even like to see a little more of the evidence related to failure of the TCR system.

      More specifically:

      Su and colleagues show that a third generation CAR with TCR zeta, CD28 and 41BB signal transduction pathways can activate a T cell for microcluster formation and Gads/SLP-76 recruitment, but not IL-2 production, without LAT. This is surprising because LAT is generally considered, as is upheld here, an essential adapter protein for T cell activation. However, this is not a "fair" experiment, as the CAR has sequences from the TCR and from two co-stimulatory receptors, CD28 and 41BB. It would be important and very straightforward to test first and second generation CARs to determine if LAT independence is a function of the CAR architecture itself or of the additional costimulatory sequences. If it turns out that a first generation CAR with only TCR sequences can trigger LAT-independent clustering and SLP-76 recruitment, then the comparison would be fair and no additional experiment would be needed to make the point that the CAR architecture is intrinsically LAT independent. If the CD28 and/or 41BB sequences are needed for LAT independence, then the fair comparison between the two architectures would be to co-crosslink TCR, CD28 and 41BB (41BB is an inducible costimulator, so anti-CD27 might be substituted to engage a constitutively expressed receptor with similar motifs).

      The authors may want to cite work from Vignali and colleagues showing that even the TCR has two signaling modules: the classical ZAP-70/LAT module that is responsible for IL-2 and a Vav/Notch-dependent module that controls proliferation. It's not clear to me that the issue raised about distinct signaling by CARs is completely parallel to this, but it's interesting that Vignali also identified the classical TCR signaling pathway as responsible for IL-2, with an alternative pathway that uses the same ITAMs to control distinct functions. See Guy CS, Vignali KM, Temirov J, Bettini ML, Overacre AE, Smeltzer M, Zhang H, Huppa JB, Tsai YH, Lobry C, Xie J, Dempsey PJ, Crawford HC, Aifantis I, Davis MM, Vignali DA. Distinct TCR signaling pathways drive proliferation and cytokine production in T cells. Nat Immunol. 2013;14(3):262-70.

      I would be very interested to see a movie of the LAT deficient T cells interacting with the anti-CD3 coated bilayers in Figure 2A. Since OKT3 has a high affinity for CD3 and is coated on the surface at a density that should engage anti-CD3, I'm surprised there is no clustering even simply based on mass action. The result looks almost like a dominant negative effect of LAT deficiency on a high affinity extracellular interaction. It would be interesting to see how this interface evolves or if there is anti-adhesive behavior that emerges.

      Significance

      While it is interesting that the CAR is LAT independent, it's obvious that the signalling networks are different, as the CAR has two sets of motifs that are absent in the TCR, so the experiments as presented are not that insightful about the specific nature of the differences that lead to the different outcomes. At present it's not a particularly well controlled experiment, as the third gen CAR is changing too many things in relation to the TCR for the experiment to be interpreted. It would be easy to address this in a revised manuscript. To publish as is, the discussion would need to acknowledge these limitations. The work is preliminary as science, but it might be useful to the T cell engineering field to have this information as a preliminary report, which might be an argument for adding a discussion of limitations but going forward without more detailed analysis of mechanism.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Rebuttal to reviewers ReviewCommons manuscript # RC-2020-00281

      We would like to thank the reviewers and editors of Review Commons for evaluating our manuscript entitled “Transcriptional comparison of Testicular Adrenal Rest Tumors with fetal and adult tissues” and providing their valuable comments. We have listed the reviewers’ comments along with our response and amendments below.

      Board Advice on initial submission:

      This seems to be a study mainly relevant to the field of Testicular Adrenal Rest Tumors (TART). It presents the first RNAseq profiling of these tumors in multiple human samples at different stages. This has the potential to advance knowledge in this particular field. It would be less interesting to researchers interested in tissue spatial transcriptomics in general, since the experimental and computational tools are quite standard, but the findings may be important to the TART field.

      Response: Indeed, this is the first study using transcriptomics to characterize Testicular Adrenal Rest Tumors, a frequent occurrence in patients with Congenital Adrenal Hyperplasia. It is also the first to find that the reported adrenal and testicular features of these tumors can be found in a single cell. We therefore believe this study is not only of interest to those working in the TART field, but also in development, endocrinology and andrology in general.

      Comments Reviewer #1:

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The manuscript by Schroder M., et al describes the whole transcriptome of testicular adrenal rest tumors (TART) and shows that TART tissue is characteristically similar to adult adrenal and testicular rather than fetal adrenal and testicular tissues. The authors propose that their previous claim that TART is derived from an undifferentiated pluripotent progenitor is likely untrue and claim that TART likely originates from a mature cell type with both adrenal and testicular characteristics. The authors describe a unique cell type most similar to the adult adrenal, but with variable testis-specific gene expression patterns. The finding of overexpressed genes associated with ECM remodeling is interesting and may provide insight into the natural history of these tumors. A strength of the study is the number of tissue samples since surgery for these rare tumors is usually not performed.

      **Major Comments:**

      • The key conclusions are mostly based on RNA studies, thus their claims are preliminary.

      Response: We agree that a major part of our conclusions is based on RNA studies. Although indeed primarily based on transcriptomics, this claim is, in our opinion, not preliminary as the identity of TART cells can definitively be deduced from their expression profile. Our second key conclusion, i.e. that TART cells comprise both adrenal and testicular features within the same, unique, TART specific cell, is based on immunohistochemistry of adrenal and testis-specific enzymes.

      In Figure 1/Result p. 3: Authors claim that there were no exclusive HSD17B3 staining cells without CYP11B1, however Figure 1 looks like there are exclusively green (HSD17B) areas (especially TART3). The authors need to address this. It appears as if there are mature Leydig cells. This is important because the presence of Leydig cells would affect the interpretation of the findings.

      Response: We understand the reviewer's concern. Nonspecific background staining for HSD17B3 in TART samples made it difficult to distinguish specific from background staining. This can be seen when comparing the staining in HSD17B3-positive (Leydig) cells with the background staining in non-Leydig cells in testis tissue and in a portion of TART cells. In TART, we found that all cells with high-intensity, specific HSD17B3 staining also showed CYP11B1 staining, but not vice versa. However, we acknowledge that, because of this staining (most likely background), the occurrence of mature Leydig cells in TART cannot be completely excluded based on our results.

      Therefore, we have tried to be more careful in our claims in the results section (page 3; TART cells express adrenal- and Leydig cell-specific steroidogenic enzymes paragraph) and we have addressed this in the discussion section (page 5/6):

      High background staining for HSD17B3 made it difficult to distinguish specific from background staining. For some cells this exclusive HSD17B3 staining might have been specific; therefore, although most HSD17B3-positive cells were also positive for CYP11B1, these results cannot guarantee the absence of mature Leydig cells in TART.

      Discussion: authors state that based on their previous observations that fetal Leydig cells have both adrenal and testis developmental potential. It was speculated that TART might have been derived from a totipotent progenitor cell type, but the current study shows that these tumors lack similarities with fetal tissues. Thus, the authors claim that these tumors are not derived from the transdifferentiation of pluripotent cells. However what is the origin of this mature distinct cell type? Is it not possible that this distinctive cell type is derived from a common progenitor since the testis and adrenal gland are derived from the same adrenogonadal primordium? Lack of similarities with fetal tissues at this late stage of development does not necessarily rule out a common progenitor origin.

      Response: In this study, we compared the TART transcriptome with fetal tissues, as we hypothesized these might be similar considering the likely progenitor origin of TART cells. However, this was not the case, and we showed that the transcriptomic profile of TART resembles the transcriptomic profile of mature cell types, rather than their fetal counterparts. Therefore, we conclude that the hypothesis that TART arises from progenitor cells is not supported by our data. The reviewer is correct that we did not prove that it is not derived from pluripotent cells. We have therefore added the following text to the discussion:

      Although we find here that the transcriptome of TART tissues is clearly distinct from that of fetal tissues, we did not prove that TART does not originate from fetal Leydig cells. TART being derived from a multipotent progenitor cell is still possible, as we initially hypothesized, given that TART is likely already present in utero and resembles both testis and adrenal tissues, which derive from a common primordium. Therefore, we were surprised to find TART to be more like adult adrenal and testis tissue, raising the possibility of TART being derived from a ‘mature’ progenitor cell type, i.e. adult stem Leydig cells or adrenal progenitor cells, that under the influence of high ACTH levels and/or the localization in the testicular region might differentiate into a distinct cell type that expresses both adrenal- and testis-specific markers. However, this remains to be established.

      **Minor Comment:**

      In Methods: Was RNA isolated from FFPE sections or frozen tissue?

      Response: We agree that this was not mentioned clearly enough in our original manuscript, as both frozen (RNA isolation) and FFPE (IHC) material was used. We have now clarified in the methods section that the RNA was retrieved from frozen tissue samples (page 8; RNA isolation, library preparation, and sequencing paragraph).

      Reviewer #1 (Significance (Required)):

      This first study of transcriptome analysis of TART provides useful insight into the characteristics of these rare tumors that commonly develop in males with classic CAH. This study provides a foundation for further investigation of the biological pathways contributing to the development of TART, the most common cause of male infertility in CAH. This study is of interest to endocrinologists. Reviewed by a pediatric endocrinologist and molecular biologist - we are not completely aware of the sequencing analysis but are familiar with clustering and enrichment analysis.

      Comments Reviewer #3:

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Schröder et al describes the transcriptome sequencing of TARTs in CAH/CS in order to sort out the origin of TARTs. This is an interesting subject and the manuscript is well-written but I have a few comments that could be addressed.

      • Some parts of the Results should be in the Methods and some in the Discussion. In the Results only the results should be given.

      Response: We agree that we have incorporated some methodological sentences and some concluding remarks in the results sections to, in our opinion, improve the flow of the manuscript. As the manuscript guidelines differ between journals, we have for now decided not to change this. We will do so if the journal in question requires it.

      Normally TARTs are not removed or biopsied, if not by mistake... Thus, most centers would not have tissue samples of TARTs at all. How come you have so many samples available?

      Response: We thank the reviewer for highlighting this. As TARTs are indeed not routinely removed, the number of TART tissues included in our dataset is unique. Most of the TART samples were obtained as early as 2004, because of reported pain and discomfort and in an attempt to improve semen quality in these patients. Analysis of those patients led to the insight that removal of longstanding TART improved neither semen parameters nor parameters of pituitary-gonadal function (Claahsen-van der Grinten et al., 2007). Therefore, to date, the only indication for surgical removal of longstanding TART is the relief of pain or discomfort.

      Ref 2 and 3 are rather old and similar. Could newer review references be used instead?

      Response: We have replaced those two references with a more recent review on Congenital Adrenal Hyperplasia by Dr. Witchel, which addresses both statements (Witchel, 2017).

      Reviewer #3 (Significance (Required)):

      New and significant study. Very interesting for people dealing with CAH patients.

      References

      Claahsen-van der Grinten, H. L., Otten, B. J., Takahashi, S., Meuleman, E. J. H., Hulsbergen-van de Kaa, C., Sweep, F. C. G. J., & Hermus, A. R. M. M. (2007). Testicular adrenal rest tumors in adult males with congenital adrenal hyperplasia: Evaluation of pituitary-gonadal function before and after successful testis-sparing surgery in eight patients. Journal of Clinical Endocrinology & Metabolism, 92(2), 612-615. doi:10.1210/jc.2006-1311

      Witchel, S. F. (2017). Congenital Adrenal Hyperplasia. Journal of Pediatric and Adolescent Gynecology, 30(5), 520-534. doi:10.1016/j.jpag.2017.04.001

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript by Schröder et al describes the transcriptome sequencing of TARTs in CAH/CS in order to sort out the origin of TARTs. This is an interesting subject and the manuscript is well-written but I have a few comments that could be addressed.

      1. Some parts of the Results should be in the Methods and some in the Discussion. In the Results only the results should be given.
      2. Normally TARTs are not removed or biopsied, if not by mistake... Thus, most centers would not have tissue samples of TARTs at all. How come you have so many samples available?
      3. Ref 2 and 3 are rather old and similar. Could newer review references be used instead?

      Significance

      New and significant study. Very interesting for people dealing with CAH patients.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Schroder M., et al describes the whole transcriptome of testicular adrenal rest tumors (TART) and shows that TART tissue is characteristically similar to adult adrenal and testicular rather than fetal adrenal and testicular tissues. The authors propose that their previous claim that TART is derived from an undifferentiated pluripotent progenitor is likely untrue and claim that TART likely originates from a mature cell type with both adrenal and testicular characteristics. The authors describe a unique cell type most similar to the adult adrenal, but with variable testis-specific gene expression patterns. The finding of overexpressed genes associated with ECM remodeling is interesting and may provide insight into the natural history of these tumors. A strength of the study is the number of tissue samples since surgery for these rare tumors is usually not performed.

      Major Comments:

      1. The key conclusions are mostly based on RNA studies, thus their claims are preliminary.

      2. In Figure 1/Result p. 3: Authors claim that there were no exclusive HSD17B3 staining cells without CYP11B1; however, in Figure 1 there appear to be exclusively green (HSD17B3) areas (especially TART3). The authors need to address this. It appears as if there are mature Leydig cells. This is important because the presence of Leydig cells would affect the interpretation of the findings.

      3. Discussion: The authors state, based on their previous observations, that fetal Leydig cells have both adrenal and testis developmental potential. It was speculated that TART might have been derived from a totipotent progenitor cell type, but the current study shows that these tumors lack similarities with fetal tissues. Thus, the authors claim that these tumors are not derived from the transdifferentiation of pluripotent cells. However, what is the origin of this mature distinct cell type? Is it not possible that this distinctive cell type is derived from a common progenitor, since the testis and adrenal gland are derived from the same adrenogonadal primordium? Lack of similarities with fetal tissues at this late stage of development does not necessarily rule out a common progenitor origin.

      Minor Comment:

      In Methods: Was RNA isolated from FFPE sections or frozen tissue?

      Significance

      This first study of transcriptome analysis of TART provides useful insight into the characteristics of these rare tumors that commonly develop in males with classic CAH. This study provides a foundation for further investigation of the biological pathways contributing to the development of TART, the most common cause of male infertility in CAH. This study is of interest to endocrinologists. Reviewed by a pediatric endocrinologist and molecular biologist - we are not completely aware of the sequencing analysis but are familiar with clustering and enrichment analysis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled "Vasohibin-1 mediated tubulin detyrosination selectively regulates secondary sprouting and lymphangiogenesis in the zebrafish trunk" by de Oliveira investigates the function of the carboxypeptidase Vasohibin during the formation of the zebrafish trunk vasculature and reports a requirement of Vasohibin for secondary sprout formation and in particular the formation of the lymphatic vasculature.

      Having established the expression of Vasohibin in sorted ECs of 24 hpf embryos, the remaining study addresses the function of Vasohibin in this cell type. It is largely based on the use of a splice-site interfering morpholino. Particularly commendable is the analysis demonstrating that the KD of vash-1 indeed results in a significant reduction of detyrosination in endothelial tubulin. Findings in the vascular system then include: (i) the detection of increased division and hence supernumerary cells occurring selectively in 2nd sprouts from the PCV; (ii) an increased persistence of the initially formed 3 way connections with ISV and artery; (iii) reduced formation of parachordal lymphangioblasts and (iv) a reduced number of somites with a thoracic duct segment; (v) frequent formation of lumenized connections between PLs (where present) and ISV. To demonstrate specificity, the approach was repeated with a different morpholino and defects were partially rescued by MO-insensitive RNA.

      Possible additional and relevant information could include data on a vash-1 promoter mutant to independently verify the MO-based functional analysis. Mutants would also allow analysis of further development: do the defects lead to the demise of the fish, or is a later regeneration and normalization of the lymphatic vasculature observed?

      We agree that a mutant would be desirable to validate the phenotypic analysis of the morpholinos used, and would also allow for further analysis. However, this is not achievable within a reasonable time frame, especially in the context of current work restrictions.

      In addition to the two splice morpholinos currently used to knock down vash-1 expression, we will use an ATG morpholino to further investigate our observations and hypothesis regarding the role of vash-1 in lymphatic vessel formation. We will also validate it by western blot and attempt to rescue it with mRNA.

      We have not investigated the phenotype past 4 dpf. We will add investigation of lymphatics and morphology at 5 dpf.

      In addition, are other lymphatic vessel beds like the cranial lymphatics affected?

      Using the Tg[fli1a:EGFP]y7 line, we have not been able to identify apparent differences in other vascular beds, including the cranial lymphatics. However, a detailed fine-grained investigation of the cranial vascular bed has not been performed. Given the focus of the present study on the trunk vasculature to understand the mechanisms of vash-1, we feel that a detailed analysis of cranial lymphatics would at this stage be somewhat out of scope.

      PLs have been demonstrated to be at least partially guided in their movement by the CXCR4/SDF1 system and SVEP1. Has the expression of these factors been tested in vash-1 KDs?

      We have not investigated the potential role of the CXCR4/SDF1 system and SVEP1 in vash-1 regulation of lymphangiogenesis. We will investigate the expression of cxcr4a, cxcl12a, cxcl12b and svep1 by in situ hybridization upon vash-1 knockdown.

      With regards to the frequently observed connections of PLs and ISVs in vash-1 morphants, can the proposed lumen formation of these shunts be demonstrated e.g. by injection of Q-dots or microbeads into the circulation?

      Although the lumenisation is very clear thanks to the membrane-targeted expression of the label in this line, we will further analyse whether these aberrant ISV-to-ISV connections can be perfused by Q-dot injection.

      Concerning the mechanisms of these defects, is it possible to analyse the asymmetric cell division leading to 2nd sprouts in greater detail? Is the same number or are more cells sprouting from the PCV and can the fli1ep:EGFP-DCX cell line in fixed samples be used to identify the spindle orientation in dividing cells?

      We agree with the reviewer and plan to use the Tg[fli1ep:EGFP-DCX] fish line to investigate spindle asymmetry in uninjected embryos, as well as compare the spindle in control MO and vash-1 KD embryos. Vash-1 has been shown to regulate spindle formation in osteosarcoma cells (Liao et al., 2019). We will attempt to clarify whether this function is conserved in endothelial cells and contributes to the control of endothelial cell proliferation during initiation and formation of secondary sprouting.

      We also agree that it is important to look at the PCV at the beginning of secondary sprouting and will clarify whether the sprouting is initiated by an increased number of cells.

      **Minor issues:** Page 5, Mat & Meth, please spell out PTU at its first mention.

      This has been corrected accordingly (see page 4).

      Page 6 Mat & Meth, Secondary sprout and 3-way connection parameters: The number of nuclei was assessed in each secondary sprouts (del s, singular) just prior...

      This has been corrected accordingly (see page 5).

      Page 16, 8th line from bottom: Recent work demonstrated that a secondary sprout either contributes (add s) to remodelling a pre-existing ISV into a vein, or forms (add s)a PLs (Geudens et al., 2019).

      This has been corrected accordingly (see page 16).

      Page 25, Legend to Fig. 2D-G: "...G,G' shows quantification of dTyr signal upon vash-1 KD..." Fig. 2G,G' show immunostaining rather than quantification of the dTyr signal, which is shown in Fig. 2H-J.

      This has been corrected accordingly (see page 26).

      Fig. 1D / Fig. 2H-J please increase weight of the error intervals and / or change colour for improved visibility

      This has been corrected accordingly (Fig. 1D and 2H-J), and we added n.s. to Fig. 1D.

      Reviewer #1 (Significance (Required)):

      Taken together, the manuscript is comprehensively written and the study provides a conclusive analysis of the MO-mediated KD of Vasohibin in zebrafish embryonic development, presenting significant novel findings. Known was a generally inhibitory function of Vasohibin on vessel formation and its enzymatic activity as a carboxypeptidase responsible for tubulin detyrosination, affecting spindle function and mitosis. New is the detailed analysis of the Vasohibin KD on zebrafish trunk vessel formation and the description of a selective impairment of 2nd sprout formation. The manuscript is of interest for vascular biologists.

      REFEREES CROSS-COMMENTING

      I fully concur with the comments of reviewer #2. All three reviews find that this study is of significant interest to the vascular biology community, as the relevance of tubulin detyrosination for developmental angiogenesis has not been investigated. All three reviews also highlight the potential limitations of the use of splice morpholinos (suggested alternatives include ATG morpholinos and CRISPR mutants), the requirement to provide further evidence for an endothelial cell-autonomous defect and the need to clarify some of the data representation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The manuscript by Bastos de Oliveira et al. describes an important investigation of endothelial tubulin detyrosination during vascular development. Namely, they found detyrosinated microtubules in secondary sprouts, which are absent in MO-vash-1 treated embryos. The authors use the vash-1 morpholino approach to uncover the developmental consequences of suppressed detyrosination in angiogenesis and lymphangiogenesis in vivo in zebrafish. By a combination of transgenic lines, immunohistochemistry and time-lapse imaging, Bastos de Oliveira et al. have found that Vash-1 is a negative regulator of secondary sprouting in zebrafish. The authors showed that in the absence of Vash-1 more cells are present in the secondary sprouts due to increased cell proliferation; however, the lymphatic vascular network fails to form. The current manuscript requires additional experimental evidence to support the conclusions. Please see below the major technical concerns and minor comments.

      **Major comments:**

      -This study is based on analysis of the phenotypes observed in embryos injected with vash-1 morpholino. The authors use two different types of splice morpholinos, perform rescue experiments with RNA, and validate one MO-vash-1 with western blot. Morpholinos are not trivial to work with, and the results are variable; hence additional controls need to be included, following the recommendations put together by the zebrafish community (Stainier et al., PLOS Genetics, 2017). As the severity of the phenotypes comparing MO1 with MO2 is different and MO-vash-1 embryos appear developmentally delayed (Figure 2D-F and 5E-F overall size seems to be affected), an additional MO is required, for example an ATG-MO, or generation of a CRISPR mutant would be favourable. All the morpholinos used need to be validated using an antibody, RT-PCR and qPCR. It is essential to carry out the rescue experiments for all the MOs used in this study, following the guidelines. Including dose-response curve data would be informative.

      We agree with the reviewer and the recommendations of the zebrafish community. We will investigate the phenotypes with another KD strategy, such as the ATG morpholino suggested by the reviewer. We will also supply more validation of MO2, including RNA rescue and western blot (already included in Fig. 5 I).

      We added dose-response curves (Supp. Figure 1 E,G) and a developmental morphology assessment for the morpholino 1 (Supp. Figure 1 A,B).

      Given our extensive analysis of the effects of vash-1 KD, we believe the embryos in 2F are not developmentally delayed. However, the image in figure 2F does give that impression, and therefore may have triggered the reviewer’s concerns. We double checked and found that, due to an oversight, we included a picture from a slightly different region of the trunk in comparison to Fig. 2D. We will add pictures of the same trunk region (Fig. 2D-F) as we have done in all other figures. We nonetheless supply a supplementary figure 1 showing and quantifying the development of the analysed vash-1 morphants.

      -In addition to EC, the levels of dTyr are lower in MO-vash-1 in neural tube and neurons spanning the trunk (Figure 2 D-G'). These have been previously shown to be important for secondary sprouting. Is it possible that the observed phenotypes in the secondary sprouting are due to defects in these neurons?

      We agree with the reviewer that a potential contribution of altered neuronal differentiation to the vascular phenotype should be clarified. We will assess the morphology of the neurons and their dendrites relevant for pathfinding (Lim et al., 2011) in vash-1 KD embryos, using a pan-neuronal zebrafish line, as well as via immunostaining against alpha-tubulin. Should we find evidence for changes in neuronal cells, we will attempt to clarify a cell autonomous role of vash-1 by transplantation experiments.

      -The number of embryos used in this study appears to be low, especially in figures 3G, 5D and 5G. To draw conclusions from these experiments, the number of embryos used should be higher than 20. For Figure 4J, please specify how many embryos were used.

      We will increase the number of embryos per condition to a minimum of 20 embryos and update the averages in the text for 3G (control: 7 and vash-1 KD: 11 embryos).

      In 5D and 5G each point is an embryo and more than 20 embryos per condition were used (in 5D 23-35 embryos per condition, in 5G 60-63 embryos per condition). We corrected the legends of 5D and 5G (see page 27) and made it clear that each point in the graph corresponds to one embryo (5D: percentage of PLs associated with veins in each embryo; 5G: percentage of somites with thoracic duct in each embryo).

      In 4J, 18 embryos were used for control (about 3 sprouts/embryo – 52 sprouts quantified) and 7 embryos for the vash-1 KD condition (about 3 sprouts/embryo – 24 sprouts quantified). We corrected the number of control sprouts in the legend and added the number of embryos to increase clarity (see page 27).

      -The authors hypothesise that VASH acts in the sprouting endothelial cells, based on the Q-PCR in Figure 1. However, in this experiment all ECs have been sorted; thus it remains ambiguous in which cell types vash-1 is expressed. Please provide the expression pattern for vash-1 across the developmental stages at which the phenotypes are observed.

      We agree with the reviewer that it would be beneficial to understand the expression pattern of vash-1 in wild type embryos. We plan to perform in situ hybridization for vash-1 mRNA.

      -Throughout the manuscript the authors refer to lymphatic identity; however, there is no evidence in the paper that the identity status has been assessed. To support these claims, Prox1 immunohistochemistry or analysis of prox1 expression in the reporter line would be appropriate.

      We agree with the reviewer and plan to perform a Prox1 immunostaining (Koltowska et al., 2015) in vash-1 KD embryos at 34-36 hpf (secondary sprouting) to investigate Prox1 levels upon vash-1 KD.

      **Minor comments:**

      -The authors refer to the literature where overexpression of VASH suppresses the angiogenesis. As the RNA injections were used in rescue experiments, the data of vash-1 RNA injections into the wild-type embryos would be beneficial.

      We have injected vash-1 RNA into control morpholino-injected embryos (28 control embryos, 14 vash-1 RNA-injected embryos) and observed a significant decrease in PLs at 52 hpf (on average, from 87.5% of somites with PLs in controls to 67% in vash-1 RNA-injected embryos). This could be due to a decrease of secondary sprouting, which would be in accordance with the current literature that vash-1 overexpression is anti-angiogenic. We will further investigate and add the results to figure 5. Figure 1. vash-1 mRNA injection leads to a decrease in somites with PLs (preliminary).

      -In figures 2J, 3J, 3K, 3N, 4J, 5C, 5D and 5G the N number was set, for example, as the number of sprouts, the number of somites with TD, or the number of ISVs. To strengthen the observations in the manuscript, quantification of the sprouts, PLs, vISVs and lymphatic phenotypes with N set as the number of embryos would be more informative. Indicating the number of embryos used in the graphs would be helpful.

      We agree with the reviewer and have added embryo numbers in all legends and graphs. In 2J, 3J, 3K, 4J each point is a sprout, a cell division or an ISV, corresponding to the N. We agree that the number of embryos could be more clearly stated, so we added the number of embryos analysed in the figure legend and will add them in the graphs.

      In 5C, 5D and 5G each point corresponds to an embryo (clarified in the legend of Fig. 5- see page 27).

      Fig. 5C refers to the percentage of somites with PLs in each embryo, 5D refers to percentage of the existing PLs in one embryo connected to a venous ISV, 5G corresponds to percentage of somites with a TD segment in each embryo.

      -In Figure 5A, B and D the authors quantify what they refer to as a lumenised connection between the vISVs and PL. In the control image (second star), a somewhat lumenised structure is present; clarification of how the scores were set is missing.

      In Fig. 5C we show a quantification of the percentage of somites with PLs per embryo, by counting the PLs identified with an asterisk in Fig. 5A-B. PLs are normally not lumenised, with few exceptions also occurring in wild-type – see Fig. 4 in (Isogai et al., 2001).

      In Fig. 5D we quantified the proportion of PLs associated/connected with venous ISVs (see Methods section page 6), by 52 hpf in control and vash-1 morphants.

      In 5B and 5F,F', the arrowheads identify lumenised PLs present in vash-1 KD embryos. We will add a quantification of kdr-l:ras-Cherry positive ISV-to-ISV connections, corresponding to the lumenised endothelial connections, since kdr-l:ras-Cherry signal labels endothelial (and not lymphatic) cells and is particularly strong at the luminal endothelial membrane of the vessel.

      -In Figure 3 E and F the authors show the excessive sprouting phenotype between controls and MO-vash-1. The images presented are taken from different parts of the embryos (middle of the trunk vs plexus region), hampering the comparison between the two groups. The quantification of the phenotypes in both experimental groups should be in the same region of the embryo, as local differences can occur. It is key to provide representative images to support these observations.

      The images presented are representative of the phenotype quantified, and the time-lapses were done in comparable regions of the zebrafish trunk (± 1-2 somites in both groups due to drift during image acquisition), making the comparison possible.

      -Figure 1D the vash-1 expression levels in EC seem very variable in this graph, therefore no conclusion can be drawn from this data, especially as the authors do not provide the p-values.

      We added n.s. in the graph, to make it clear that the difference between developmental stages is not significant, potentially due to high biological variability between embryos, as seen in two primer pairs. We believe that presenting this biological variability is of importance to the readers.

      We write on page 12 about this result: “During the sprouting phase (24 hpf), vash-1 expression was 5-7 times higher in endothelial than in non-ECs, decreasing at 48 hpf (Fig. 1C-D). Although these results are not significant, they were independently confirmed with a second primer set.” The only conclusion we made from this data is that Vash-1 is dynamically expressed in the zebrafish endothelium during development, as we now added in the discussion (page 14).

      -In the introduction, the authors state: 'Although primary and secondary sprouts appear morphologically similar, with tip and stalk cells' - Please provide the reference that supports the claim that secondary sprouts have tip-stalk cells morphology/organisation.

      Although many studies have investigated primary and secondary sprouting, identifying both shared and distinct molecular regulation, and show morphological details that are apparently similar, a formal claim that secondary sprouts show tip and stalk cell identities and behaviour is hard to find. Given that this is not relevant for the central findings of the work, we modified the sentence and added a reference: “Although primary and secondary sprouts appear morphologically similar, with tip and stalk cells (Isogai et al., 2003)…” See page 2.

      We also updated the discussion for consistency: “Although the cellular mechanisms of primary and secondary sprouting in zebrafish appear very similar, with tip cell selection and guided migration and stalk cell proliferation, secondary sprouting utilises alternative signalling pathways and entails a unique specification step that establishes both venous ISVs and lymphatic structures.” (see page 15)

      -The authors refer to the increased cell division phenotypes observed in the movies; however, the movie files have not been made available to the reviewers.

      We will provide the movies.

      Reviewer #2 (Significance (Required)):

      This is an important study, as uncovering the mechanistic details of angiogenic and lymphangiogenic negative regulators is of high value with the potential for therapeutic developments. To date, Vash-1 has been only studied in the context of tumour angiogenesis, vasculature in diabetic nephropathy and pulmonary arterial hypertension, and it remains unclear what its role is during development and how it regulates vascular network formation. The tyrosination status of microtubules in endothelial cells is understudied. This study revealed previously uncharacterised detyrosinated microtubules in endothelial cells in vivo. It further dissects how this process might be regulated, bringing unique insights into the vascular biology field and beyond. Thus, delving into cell biological mechanisms such as microtubule dynamics and modification in vivo in the embryo context is a significant step forward in setting new standards in the field.

      I am a developmental biologist who has experience in model organisms such as zebrafish and mouse. The main focus of my work is on developmental angiogenesis and lymphangiogenesis.

      REFEREES CROSS-COMMENTING

      After reading the other reviewers' comments, it seems that we all agree that this study is of high value to the vascular biology field and beyond, bringing novel findings.

      Importantly, the reviewers' comments are in line with each other and have identified several commonalities that should be addressed. These include: further validation of the morpholinos, or use of alternative methods to replicate the findings; additional evidence that the observed phenotypes are primarily due to a vash-1 requirement within ECs, and not due to secondary effects in other cells or systems such as the CXCR4/SDF1 system and SVEP1, neurons, or a general delay of the embryos; further evidence for the VASH expression pattern; and clarification of the number of embryos used in the experiments and how the data are represented.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Vasohibin-1 (Vash-1) is known to detyrosinate microtubules (MTs) and limit angiogenesis. Using in vivo live imaging and whole mount immunofluorescence staining of zebrafish trunk vasculature, Bastos de Oliveira et al. show that the MT detyrosination role of Vash-1 is conserved in zebrafish and that Vash-1 is essential for limiting venous sprouting and subsequent formation of lymphatics. Their findings suggest a role for MT detyrosination in lympho-venous cell specification.

      **Major comments:**

      1 . The authors claim that Vash-1 regulates secondary sprouting and lymphangiogenesis by detyrosinating MTs. However, no direct evidence of this link is provided in the manuscript. The authors only separately show that knockdown of vash-1 affects MT detyrosination and secondary sprouting and lymphangiogenesis. They have not shown a causative effect. The authors should therefore qualify the above stated claim as speculative. In other words, the authors should mention that their data only suggests that disruption of MT detyrosination is the underlying cause for aberrant secondary sprouting and lymphangiogenesis in vash-1 KD embryos.

      We agree with the reviewer about the lack of evidence to state that the disruption of microtubule detyrosination leads to aberrant secondary sprouting. Although we believe this is the most parsimonious explanation for the behavioural defects of secondary sprouts, as cell division is disturbed and microtubule detyrosination is implicated in cell division (Barisic et al., 2015), we want to make clear that our data currently only suggest a specific role of microtubule detyrosination in secondary sprouting. Examples of this are page 14 of the discussion, “These results suggest that Vash-1-driven microtubule detyrosination limits excessive venous EC sprouting and proliferation during lympho-venous development in zebrafish.”, as well as the abstract.

      We also corrected the sentence in the discussion (page 14): “In this study, we identified Vash-1-mediated microtubule detyrosination as a cellular mechanism as a novel regulator of EC sprouting from the PCV and the subsequent formation of lymphatic vessels in the zebrafish trunk.”

      To avoid any overstatement, we also propose the following title change: Vasohibin-1 mediated tubulin detyrosination selectively regulates secondary sprouting and lymphangiogenesis in the zebrafish trunk.

      As detailed in response to comment 2 below, we will however attempt to investigate the direct connection. Depending on the outcome, we will adapt conclusions and title accordingly.

      2 . In order to provide more compelling evidence for a direct relationship between MT tyrosination and lymphangiogenesis, the authors could try mutating the carboxypeptidase domain of vash-1 or overexpressing a dominant negative transcript (that contains a mutated carboxypeptidase domain). If this gives the same phenotypes as the vash-1 morphants, it would indicate that the carboxypeptidase activity of Vash-1 (in detyrosinating MTs) is responsible for limiting secondary sprouting and promoting specification of lymphatics. This suggested experiment is fairly realistic in terms of both time and resources. For example, since the authors already have the human vash-1 cDNA cloned, making a dominant negative transcript from this would take around two weeks, imaging and analysis of embryos injected with this mRNA would take another four weeks. Therefore, in total, the suggested experiment would take around 6 weeks. Although the alternative experiment, that is, making a carboxypeptidase domain mutant of vash-1 would be a better choice in terms of reproducibility and long-term use of a stable line, it would admittedly take a relatively larger amount of time. Therefore, the ultimate choice would depend on the authors.

      We will investigate this further by cloning and expressing a mutated vash-1 cDNA which translates a validated catalytically dead Vash-1 (Nieuwenhuis et al., 2017). However, this mutant has not been shown to function as dominant negative, so it is unclear whether it can be used as a dominant negative mutant.

      3 . Both the data and methods are presented in a way that ensures reproducibility. The statistical analysis is very well done, in that the authors were very prudent in their choice of statistical tests. However, in many figures and subfigures (Fig. 2B, H-J; Fig. 3G, J, K, N; Fig. 4J; Fig. 5J), the number of replicates was not mentioned and instead only the sample size was stated. Whether this was just an oversight or if it should be taken to mean that the analysis was performed on just one replicate is unclear. The authors need to clarify this aspect of their analysis. Further, in Fig. 2H-J, Fig. 3G, J, K, N and Fig. 4J, the total number of data points in control MO vs vash-1 KD seems to be quite different. In other words, there seem to be a lot more data points in one experimental condition than the other. Does this difference fall within the acceptable range? If the authors were to compare a similar number of data points between the two experimental conditions, would the results of the statistical analysis still be the same?

      We appreciate this comment and clarified the replicate numbers in the figure legends: Fig. 2B- 3 replicates (page 25), Fig. 2 H-J- quantification is 1 replicate (page 26), Fig. 2 D-G is representative of 3 replicates (page 25). Fig. 3 G,J,K,N – quantification is from 1 replicate (page 26), Fig. 3 B,C,E,F,H,I are representative of 2 experimental replicates (page 26). Fig. 4J – quantification is 1 replicate (page 27), Fig. 4 A-F is representative of 3 replicates (page 27). Fig. 5 J corresponds to 1 replicate (page 28).

      We plan to increase replicates and numbers in quantifications shown in Fig. 3 G,J,K,N and Fig. 5 J as they are relevant for the conclusions of the manuscript, and adapt the text.

      The quantifications of immunostaining signals are comparable between different samples of the same experiment but technically not easy accross different experiments, due to some variability of the immunostaining. However, the pattern we report in the quantifications and representative pictures is consistentely detected (reduced dTyr signal upon vash-1 KD in Fig 2 D-G; higher dTyr intensity in secondary rather than primary sprouts in Fig. 4 A-F). We added in the legend that the pictures of the embryos in these figures are representative of 3 biological replicates (see page 25 and 27).

      We recognise the unequal sample size in control and vash-1 KD groups in Fig. 2H-J, Fig. 3G, J, K, N and Fig. 4J. Generally, the vash-1 KD group shows more variance than the control group (see Fig. 3 J-N and 4J, for example), which is why we analysed a larger sample.

      In the planned experiments (repeating quantifications of Fig. 3 J-N), we will analyse a similar number of embryos.

      We corrected the figure legend of 2 H-J on the number of ISVs - 108 ISVs from 7 embryos for control and 150 ISVs for vash-1 KD, from 9 embryos (see page 26).

      4 . The authors only provide KD data on the function of vash-1 using morpholinos. According to several recent guidelines concerning the use of morpholinos, this is not widely accepted in the zebrafish community as sufficient to provide robust insight into gene function. Please refer for example to the following publication: Guidelines for morpholino use in zebrafish, Stainier et al., PLOS Genetics, 2017. The generation of a vash-1 mutant is a necessary requirement for backing up morpholino KD data. Further, even though the authors state that embryos were selected on the pre-established criteria that they have normal morphology, beating heart, and flowing blood, certain morphological differences between control MO injected and vash-1 KD embryos could be observed in some figures. In Fig. 2D, F and Fig. 5A, B, E, F the vash-1 KD embryos seem smaller (extent of the dorso-ventral axis) than control MO injected embryos. The authors need to provide images showing the overall morphology of morpholino injected embryos and need to provide evidence that morpholino injections do not cause developmental delays.

      We agree that a mutant would be desirable to validate the phenotypic analysis of the morpholinos used, and would also allow for further analysis. However, this is not achievable within a reasonable time frame, especially in the context of current work restrictions. We have added a sentence about the need to confirm the loss-of-function phenotype with vash-1 mutants in the discussion (see page 14).

      In addition to the two morpholinos currently used to knock down vash-1 expression, we will use an ATG morpholino to further investigate our observations and hypothesis regarding the role of vash-1 in lymphatic vessel formation. We will also validate it by western blot and attempt to rescue it with mRNA.

      We added a supplementary figure with pictures and quantifications of the antero-posterior (Sup. Figure 1 C) and dorso-ventral length (Sup. Figure 1 D) of the analysed control and vash-1 morpholino injected embryos at 24, 34 and 52 hpf and at 4 dpf, which shows no significant developmental delay or morphological defect. There is some occurrence of curvature of the tail at 34-52 hpf.

      We added a sentence in the Methods section (page 10) to clarify the morphants' morphology and the dose-response curves.

      We observe a 1-2 hour developmental delay of both the control and the vash-1 KD embryos compared to uninjected wild-type embryos, which led us to choose the 52 hpf time point to investigate the PLs. In uninjected embryos they are usually developed by 48 hpf (Hogan et al., 2009).

      Fig. 2 D shows a more anterior region of the zebrafish trunk than Fig. 2F (the tail has a smaller dorso-ventral length)- we will provide more comparable pictures from the same region.

      Fig. 5B is slightly tilted – we will provide a picture with the same orientation.

      Fig. 5 E and F have a similar length from dorsal aorta to the dorsal longitudinal anastomotic vessel. However, we appreciate a difference in the sub intestinal vascular plexus (SIVP), which is consistently underdeveloped in the vash-1 KD embryos.

      Figure 2- vash-1 deficient embryos show underdeveloped intestinal vascular system at 4 dpf.

      **Minor comments:**

      a. The authors should back their qPCR data for vash-1 expression (Figure 1) by standard mRNA in situ hybridization, given the large degree of variability in vash-1 expression. Do they observe a dynamic expression in the vasculature using this technique?

      We agree with the reviewer that an in situ hybridization would be beneficial to understand the expression pattern of vash-1 in wild type embryos. Accordingly, we will look at vash-1 expression by in situ hybridization in WT embryos.

      b. The number of nuclei per sprout in Fig. 3J does not correspond with the number of divisions per sprout presented in Fig. 3K. The authors observe one or two cell divisions per sprout in ctr MO injected embryos (Fig. 3K), however, Fig. 3J shows that the majority of ctr. sprouts contains only one cell. This is even more dramatic for vash-1 MO injected embryos, which can have up to four divisions, therefore should contain six cells. However, the maximum number of cells the authors report is three to four cells. How do these observations go together?

      We believe these quantifications are not contradictory. The number of endothelial nuclei was assessed just prior to the connection to the ISV, whereas the cell divisions were quantified in a time-lapse from the time of secondary sprout emergence until the resolution of the 3-way connection. More cell divisions are expected over this longer time frame, as cells migrate dorsally or ventrally out of the sprout.

      c. Fig. 5I and J have the same data points for control MO and vash-1 MO1. Does this mean that both graphs are from the same experiment? If so, the authors could combine the two graphs into one. If the two graphs are not from the same experiment, both would need to have independent controls.

      Fig 5 I and J are indeed from the same experiment. They are now combined into one graph (see Fig. 5 J).

      d. The percentage of somites with PLs in vash-1 MO1 injected embryos in Fig. 5I is half the value shown in Fig. 5C. Although this kind of variability might be expected in biological samples, perhaps the authors could briefly discuss the issue and its implications on reproducibility in the manuscript so as to have the readers be aware of it, especially since the rescue of the vash-1 morpholino phenotype back to 50% from 25% is the same value the authors observed in the vash-1 KD alone in Fig. 5C. Here the value is 50% for the morpholino injection.

      We added a sentence discussing the phenotypic variability in the discussion (see page 16), and we added a dose-response curve for the PLs (Sup. Figure 1 F), showing that embryos injected with the same amount of morpholino show variability in the percentage of somites with PLs at 52 hpf. We added a more representative picture of PLs for the vash-1 morphant in Fig. 5I.

      e. The Y-axis label is missing in Fig. 2H and Fig. 4J. Figure 5D lacks bars showing median and standard deviation.

      The Y-axes of Fig. 2H and 4J correspond to ratios, which have no units. Nonetheless, we added AU/AU to these graphs to make it clearer. We added the bars in Fig. 5D.

      f. It would help to have an inference or conclusion at the end of each results section.

      We added one conclusion sentence per results section (see pages 11-14).

      Reviewer #3 (Significance (Required)):

      Conceptual: As per my knowledge, this is the first study that looks at microtubule modifications in the context of a vertebrate organism past the gastrulation stage, as opposed to similar studies that have been done in cell culture or invertebrates (S. cerevisiae, C. elegans and D. melanogaster). Moreover, this study is one of few that address a novel link between the cytoskeleton and the process of cell fate specification.

      Previous studies have separately shown that Vash-1 limits angiogenesis and detyrosinates MTs. The current study combines the two observations in the context of endothelial cells, and hypothesizes that perhaps the function of Vash-1 in limiting angiogenesis and at the same time promoting lymphatic development could be due to its role in MT modification at the molecular level and the consequent effect of this on cell division and/or fate specification at the cellular level. In short, this study aims to connect the long-standing gap in knowledge between cytoskeletal modifications and cell dynamics (in particular, division and specification) in a vertebrate organism. I therefore believe that the current study would be an exciting finding for research communities that study cytoskeletal influence on cellular dynamics and also those in the broad area of vascular biology.

      My field of expertise relates to vascular biology, specifically developmental angiogenesis and the behavior of endothelial cells in zebrafish.

      References

      Barisic, M., Silva E Sousa, R., Tripathy, S. K., Magiera, M. M., Zaytsev, A. V., Pereira, A. L., Janke, C., Grishchuk, E. L., & Maiato, H. (2015). Microtubule detyrosination guides chromosomes during mitosis. Science, 348(6236), 799–803. https://doi.org/10.1126/science.aaa5175

      Hogan, B. M., Bos, F. L., Bussmann, J., Witte, M., Chi, N. C., Duckers, H. J., & Schulte-Merker, S. (2009). Ccbe1 is required for embryonic lymphangiogenesis and venous sprouting. Nature Genetics, 41(4), 396–398. https://doi.org/10.1038/ng.321

      Isogai, S., Horiguchi, M., & Weinstein, B. M. (2001). The vascular anatomy of the developing zebrafish: an atlas of embryonic and early larval development. Developmental Biology, 230(2), 278–301. https://doi.org/10.1006/dbio.2000.9995

      Isogai, S., Lawson, N. D., Torrealday, S., Horiguchi, M., & Weinstein, B. M. (2003). Angiogenic network formation in the developing vertebrate trunk. Development, 130(21), 5281–5290. https://doi.org/10.1242/dev.00733

      Kimura, H., Miyashita, H., Suzuki, Y., Kobayashi, M., Watanabe, K., Sonoda, H., Ohta, H., Fujiwara, T., Shimosegawa, T., & Sato, Y. (2009). Distinctive localization and opposed roles of vasohibin-1 and vasohibin-2 in the regulation of angiogenesis. Blood, 113(19), 4810–4818. https://doi.org/10.1182/blood-2008-07-170316

      Koltowska, K., Lagendijk, A. K., Pichol-Thievend, C., Fischer, J. C., Francois, M., Ober, E. A., Yap, A. S., & Hogan, B. M. (2015). Vegfc Regulates Bipotential Precursor Division and Prox1 Expression to Promote Lymphatic Identity in Zebrafish. Cell Reports, 13(9), 1828–1841. https://doi.org/10.1016/j.celrep.2015.10.055

      Liao, S., Rajendraprasad, G., Wang, N., Eibes, S., Gao, J., Yu, H., Wu, G., Tu, X., Huang, H., Barisic, M., & Xu, C. (2019). Molecular basis of vasohibins-mediated detyrosination and its impact on spindle function and mitosis. Cell Research, June. https://doi.org/10.1038/s41422-019-0187-y

      Lim, A. H., Suli, A., Yaniv, K., Weinstein, B., Li, D. Y., & Chien, C. Bin. (2011). Motoneurons are essential for vascular pathfinding. Development, 138(21), 4813. https://doi.org/10.1242/dev.075044

      Nicenboim, J., Malkinson, G., Lupo, T., Asaf, L., Sela, Y., Mayseless, O., Gibbs-Bar, L., Senderovich, N., Hashimshony, T., Shin, M., Jerafi-Vider, A., Avraham-Davidi, I., Krupalnik, V., Hofi, R., Almog, G., Astin, J. W., Golani, O., Ben-Dor, S., Crosier, P. S., … Yaniv, K. (2015). Lymphatic vessels arise from specialized angioblasts within a venous niche. Nature, 522(7554), 56–61. https://doi.org/10.1038/nature14425

      Nieuwenhuis, J., Adamopoulos, A., Bleijerveld, O. B., Mazouzi, A., Stickel, E., Celie, P., Altelaar, M., Knipscheer, P., Perrakis, A., Blomen, V. A., & Brummelkamp, T. R. (2017). Vasohibins encode tubulin detyrosinating activity. Science, 358(6369), 1453–1456. https://doi.org/10.1126/science.aao5676

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Vasohibin-1 (Vash-1) is known to detyrosinate microtubules (MTs) and limit angiogenesis. Using in vivo live imaging and whole mount immunofluorescence staining of zebrafish trunk vasculature, Bastos de Oliveira et al. show that the MT detyrosination role of Vash-1 is conserved in zebrafish and that Vash-1 is essential for limiting venous sprouting and subsequent formation of lymphatics. Their findings suggest a role for MT detyrosination in lympho-venous cell specification.

      Major comments:

      1 . The authors claim that Vash-1 regulates secondary sprouting and lymphangiogenesis by detyrosinating MTs. However, no direct evidence of this link is provided in the manuscript. The authors only separately show that knockdown of vash-1 affects MT detyrosination and secondary sprouting and lymphangiogenesis. They have not shown a causative effect. The authors should therefore qualify the above stated claim as speculative. In other words, the authors should mention that their data only suggests that disruption of MT detyrosination is the underlying cause for aberrant secondary sprouting and lymphangiogenesis in vash-1 KD embryos.

      2 . In order to provide more compelling evidence for a direct relationship between MT tyrosination and lymphangiogenesis, the authors could try mutating the carboxypeptidase domain of vash-1 or overexpressing a dominant negative transcript (that contains a mutated carboxypeptidase domain). If this gives the same phenotypes as the vash-1 morphants, it would indicate that the carboxypeptidase activity of Vash-1 (in detyrosinating MTs) is responsible for limiting secondary sprouting and promoting specification of lymphatics. This suggested experiment is fairly realistic in terms of both time and resources. For example, since the authors already have the human vash-1 cDNA cloned, making a dominant negative transcript from this would take around two weeks, imaging and analysis of embryos injected with this mRNA would take another four weeks. Therefore, in total, the suggested experiment would take around 6 weeks. Although the alternative experiment, that is, making a carboxypeptidase domain mutant of vash-1 would be a better choice in terms of reproducibility and long-term use of a stable line, it would admittedly take a relatively larger amount of time. Therefore, the ultimate choice would depend on the authors.

      3 . Both the data and methods are presented in a way that ensures reproducibility. The statistical analysis is very well done, in that the authors were very prudent in their choice of statistical tests. However, in many figures and subfigures (Fig. 2B, H-J; Fig. 3G, J, K, N; Fig. 4J; Fig. 5J), the number of replicates was not mentioned and instead only the sample size was stated. Whether this was just an oversight or if it should be taken to mean that the analysis was performed on just one replicate is unclear. The authors need to clarify this aspect of their analysis. Further, in Fig. 2H-J, Fig. 3G, J, K, N and Fig. 4J, the total number of data points in control MO vs vash-1 KD seems to be quite different. In other words, there seem to be a lot more data points in one experimental condition than the other. Does this difference fall within the acceptable range? If the authors were to compare a similar number of data points between the two experimental conditions, would the results of the statistical analysis still be the same?

      4 . The authors only provide KD data on the function of vash-1 using morpholinos. According to several recent guidelines concerning the use of morpholinos, this is not widely accepted in the zebrafish community as sufficient to provide robust insight into gene function. Please refer for example to the following publication: Guidelines for morpholino use in zebrafish, Stainier et al., PLOS Genetics, 2017. The generation of a vash-1 mutant is a necessary requirement for backing up morpholino KD data. Further, even though the authors state that embryos were selected on the pre-established criteria that they have normal morphology, beating heart, and flowing blood, certain morphological differences between control MO injected and vash-1 KD embryos could be observed in some figures. In Fig. 2D, F and Fig. 5A, B, E, F the vash-1 KD embryos seem smaller (extent of the dorso-ventral axis) than control MO injected embryos. The authors need to provide images showing the overall morphology of morpholino injected embryos and need to provide evidence that morpholino injections do not cause developmental delays.

      Minor comments:

      a. The authors should back their qPCR data for vash-1 expression (Figure 1) by standard mRNA in situ hybridization, given the large degree of variability in vash-1 expression. Do they observe a dynamic expression in the vasculature using this technique?

      b. The number of nuclei per sprout in Fig. 3J does not correspond with the number of divisions per sprout presented in Fig. 3K. The authors observe one or two cell divisions per sprout in ctr MO injected embryos (Fig. 3K), however, Fig. 3J shows that the majority of ctr. sprouts contains only one cell. This is even more dramatic for vash-1 MO injected embryos, which can have up to four divisions, therefore should contain six cells. However, the maximum number of cells the authors report is three to four cells. How do these observations go together?

      c. Fig. 5I and J have the same data points for control MO and vash-1 MO1. Does this mean that both graphs are from the same experiment? If so, the authors could combine the two graphs into one. If the two graphs are not from the same experiment, both would need to have independent controls.

      d. The percentage of somites with PLs in vash-1 MO1 injected embryos in Fig. 5I is half the value shown in Fig. 5C. Although this kind of variability might be expected in biological samples, perhaps the authors could briefly discuss the issue and its implications on reproducibility in the manuscript so as to have the readers be aware of it, especially since the rescue of the vash-1 morpholino phenotype back to 50% from 25% is the same value the authors observed in the vash-1 KD alone in Fig. 5C. Here the value is 50% for the morpholino injection.

      e. The Y-axis label is missing in Fig. 2H and Fig. 4J. Figure 5D lacks bars showing median and standard deviation.

      f. It would help to have an inference or conclusion at the end of each results section.

      Significance

      Conceptual: As per my knowledge, this is the first study that looks at microtubule modifications in the context of a vertebrate organism past the gastrulation stage, as opposed to similar studies that have been done in cell culture or invertebrates (S. cerevisiae, C. elegans and D. melanogaster). Moreover, this study is one of few that address a novel link between the cytoskeleton and the process of cell fate specification.

      Previous studies have separately shown that Vash-1 limits angiogenesis and detyrosinates MTs. The current study combines the two observations in the context of endothelial cells, and hypothesizes that perhaps the function of Vash-1 in limiting angiogenesis and at the same time promoting lymphatic development could be due to its role in MT modification at the molecular level and the consequent effect of this on cell division and/or fate specification at the cellular level. In short, this study aims to connect the long-standing gap in knowledge between cytoskeletal modifications and cell dynamics (in particular, division and specification) in a vertebrate organism. I therefore believe that the current study would be an exciting finding for research communities that study cytoskeletal influence on cellular dynamics and also those in the broad area of vascular biology.

      My field of expertise relates to vascular biology, specifically developmental angiogenesis and the behavior of endothelial cells in zebrafish.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Bastos de Oliveira et al. describes an important investigation of endothelial tubulin detyrosination during vascular development. Namely, they found detyrosinated microtubules in secondary sprouts, which are absent in MO-vash-1 treated embryos. The authors use the vash-1 morpholino approach to uncover the developmental consequences of suppressed detyrosination in angiogenesis and lymphangiogenesis in vivo in zebrafish. By a combination of transgenic lines, immunohistochemistry and time-lapse imaging, Bastos de Oliveira et al. have found that Vash-1 is a negative regulator of secondary sprouting in zebrafish. The authors showed that in the absence of Vash-1 more cells are present in the secondary sprouts due to increased cell proliferation; however, the lymphatic vascular network fails to form. The current manuscript requires additional experimental evidence to support the conclusions. Please see below the major technical concerns and minor comments.

      Major comments:

      -This study is based on analysis of the phenotypes observed in embryos injected with vash-1 morpholino. The authors use two different types of splice morpholinos, perform rescue experiments with RNA, and validate one MO-vash-1 with western blot. Morpholinos are not trivial to work with, and the results are variable; hence additional controls need to be included, following the recommendations put together by the zebrafish community (Stainier et al., PLOS Genetics, 2017). As the severity of the phenotypes differs between MO1 and MO2, and MO-vash-1 embryos appear developmentally delayed (in Figure 2D-F and 5E-F the overall size seems to be affected), an additional MO is required, for example an ATG-MO; alternatively, generation of a CRISPR mutant would be favourable. All the morpholinos used need to be validated using an antibody, RT-PCR and qPCR. It is essential to carry out the rescue experiments for all the MOs used in this study, following the guidelines. Including dose-response curve data would be informative.

      -In addition to EC, the levels of dTyr are lower in MO-vash-1 in the neural tube and neurons spanning the trunk (Figure 2 D-G'). These have been previously shown to be important for secondary sprouting. Is it possible that the observed phenotypes in the secondary sprouting are due to defects in these neurons?

      -The number of embryos used in this study appears to be low, especially in Figures 3G, 5D and 5G. To draw conclusions from these experiments, the number of embryos used should be higher than 20. For Figure 4J, please specify how many embryos were used.

      -The authors hypothesise that VASH acts in the sprouting endothelial cells, based on the qPCR in Figure 1. However, in this experiment all EC have been sorted, thus it remains ambiguous in which cell types vash-1 is expressed. Please provide the expression pattern for vash-1 across the developmental stages at which the phenotypes are observed.

      -Throughout the manuscript the authors refer to lymphatic identity; however, there is no evidence in the paper that the identity status has been assessed. To support these claims, Prox1 immunohistochemistry or analysis of prox1 expression in the reporter line would be appropriate.

      Minor comments:

      -The authors refer to the literature where overexpression of VASH suppresses angiogenesis. As RNA injections were used in the rescue experiments, data on vash-1 RNA injections into wild-type embryos would be beneficial.

      -In Figures 2J, 3J, 3K, 3N, 4J, 5C, 5D and 5G the N number was set, for example, as the number of sprouts, the number of somites with TD, or the number of ISVs. To strengthen the observations in the manuscript, quantification of the sprouts, PL, vISVs and lymphatic phenotypes with N set as the number of embryos would be more informative. Indicating the number of embryos used in the graphs would be helpful.

      -In Figure 5A, B and D the authors quantify what they refer to as a lumenised connection between the vISVs and PL. In the control image (second star), a somewhat lumenised structure is present; clarification of how the scores were set is missing.

      -In Figure 3 E and F the authors show the excessive sprouting phenotype in controls and MO-vash-1. The images presented are taken from different parts of the embryos (middle of the trunk vs plexus region), hampering the comparison between the two groups. The quantification of the phenotypes in both experimental groups should be in the same region of the embryo, as local differences can occur. It is key to provide representative images to support these observations.

      -In Figure 1D, the vash-1 expression levels in EC seem very variable; therefore no conclusion can be drawn from these data, especially as the authors do not provide the p-values.

      -In the introduction, the authors state: 'Although primary and secondary sprouts appear morphologically similar, with tip and stalk cells' - Please provide the reference that supports the claim that secondary sprouts have tip-stalk cells morphology/organisation.

      -The authors refer to the increased cell division phenotypes observed in the movies; however, the movie files have not been made available to the reviewers.

      Significance

      This is an important study, as uncovering the mechanistic details of angiogenic and lymphangiogenic negative regulators is of high value, with the potential for therapeutic developments. To date, Vash-1 has only been studied in the context of tumour angiogenesis, vasculature in diabetic nephropathy and pulmonary arterial hypertension, and it remains unclear what its role is during development and how it regulates vascular network formation. The tyrosination status of microtubules in endothelial cells is understudied. This study reveals previously uncharacterised detyrosinated microtubules in endothelial cells in vivo, further dissects how this process might be regulated, and brings unique insights into the vascular biology field and beyond. Thus, delving into cell biological mechanisms such as microtubule dynamics and modification in vivo in the embryo context is a significant step forward in setting new standards in the field.

      I am a developmental biologist who has experience in model organisms such as zebrafish and mouse. The main focus of my work is on developmental angiogenesis and lymphangiogenesis.

      REFEREES CROSS-COMMENTING

      After reading the other reviewers' comments, it seems that we all agree that this study brings novel findings and is of high value to the vascular biology field and beyond.

      Importantly, the reviewers' comments are in line with each other and have identified several commonalities that should be addressed, such as:

      -Further validation of the morpholinos, or the use of alternative methods to replicate the findings.

      -Additional evidence that the observed phenotypes are primarily due to a vash-1 requirement within EC, and not due to secondary effects in other cells, such as the CXCR4/SDF1 system and SVEP1, neurons, or a general delay of the embryos.

      -Further evidence for the VASH expression pattern.

      -The number of embryos used in the experiments, and how the data are represented.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript entitled "Vasohibin-1 mediated tubulin detyrosination selectively regulates secondary sprouting and lymphangiogenesis in the zebrafish trunk" by de Oliveira investigates the function of the carboxypeptidase Vasohibin during the formation of the zebrafish trunk vasculature and reports a requirement of Vasohibin for secondary sprout formation and in particular the formation of the lymphatic vasculature.

      Having established the expression of Vasohibin in sorted ECs of 24 hpf embryos, the remaining study addresses the function of Vasohibin in this cell type. It is largely based on the use of a splice-site interfering morpholino. Particularly commendable is the analysis demonstrating that the KD of vash-1 indeed results in a significant reduction of detyrosination in endothelial tubulin. Findings in the vascular system then include: (i) the detection of increased division and hence supernumerary cells occurring selectively in 2nd sprouts from the PCV; (ii) an increased persistence of the initially formed 3-way connections with ISV and artery; (iii) reduced formation of parachordal lymphangioblasts; (iv) a reduced number of somites with a thoracic duct segment; and (v) frequent formation of lumenized connections between PLs (where present) and ISV. To demonstrate specificity, the approach was repeated with a different morpholino and defects were partially rescued by MO-insensitive RNA.

      Possible additional and relevant information could include data on a vash-1 promoter mutant to independently verify the MO-based functional analysis. Mutants would also allow analysis of further development: are the defects leading to the demise of the fish, or is a later regeneration and normalization of the lymphatic vasculature observed? In addition, are other lymphatic vessel beds like the cranial lymphatics affected? PLs have been demonstrated to be at least partially guided in their movement by the CXCR4/SDF1 system and SVEP1. Has the expression of these factors been tested in vash-1 KDs? With regard to the frequently observed connections of PLs and ISVs in vash-1 morphants, can the proposed lumen formation of these shunts be demonstrated, e.g. by injection of Q-dots or microbeads into the circulation? Concerning the mechanisms of these defects, is it possible to analyse the asymmetric cell division leading to 2nd sprouts in greater detail? Is the same number or are more cells sprouting from the PCV, and can the fli1ep:EGFP-DCX cell line in fixed samples be used to identify the spindle orientation in dividing cells?

      Minor issues: Page 5, Mat & Meth, please spell out PTU at its first mention.

      Page 6 Mat & Meth, Secondary sprout and 3-way connection parameters: The number of nuclei was assessed in each secondary sprouts (del s, singular) just prior...

      Page 16, 8th line from bottom: Recent work demonstrated that a secondary sprout either contributes (add s) to remodelling a pre-existing ISV into a vein, or forms (add s) a PLs (Geudens et al., 2019).

      Page 25, Legend to Fig. 2D-G: "...G,G' shows quantification of dTyr signal upon vash-1 KD..." Fig. 2 G,G' show immunostaining rather than quantification of the dTyr signal, which is shown in Fig. 2H-J.

      Fig. 1D / Fig. 2H-J: please increase the weight of the error intervals and/or change their colour for improved visibility.

      Significance

      Taken together, the manuscript is comprehensively written and the study provides a conclusive analysis of the MO-mediated KD of Vasohibin in zebrafish embryonic development, presenting significant novel findings. Already known were a generally inhibitory function of Vasohibin on vessel formation and its enzymatic activity as a carboxypeptidase responsible for tubulin detyrosination, affecting spindle function and mitosis. New are the detailed analysis of the Vasohibin KD on zebrafish trunk vessel formation and the description of a selective impairment of 2nd sprout formation. The manuscript is of interest for vascular biologists.

      REFEREES CROSS-COMMENTING

      I fully concur with the comments of reviewer #2. All three reviews find that this study is of significant interest to the vascular biology community, as the relevance of tubulin detyrosination for developmental angiogenesis has not been investigated. All three reviews also highlight the potential limitations of the use of splice morpholinos (suggested alternatives include ATG morpholinos and CRISPR mutants), the requirement to provide further evidence for an endothelial cell-autonomous defect, and the need to clarify some of the data representation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Response to Reviewers

      We are grateful to the Reviewers for their thoughtful and helpful assessment of our work. Below we include a point-by-point response to the Reviewers' critiques concerning the interpretation of our results and the power of our system to elucidate key dynamics of fission yeast homology-directed repair (HDR). We appreciate that the Reviewers judged our assay to be a valuable new tool for studying DSB repair in S. pombe. In general, the Reviewers also felt that our data provide new insights into homology search during HDR in fission yeast, including 1) that multiple DSB-donor encounters often precede repair and 2) that the activity of the helicase Rqh1, which dissolves strand invasion structures, alters the kinetics and efficiency of HDR in S. pombe. The Reviewers also raised several concerns with regard to 1) some technical aspects of the experimental approach, 2) the display of the data, and 3) the interpretation of the data. The Reviewers requested additional experiments to address the efficacy of our 5-minute observational time window and the rate of spontaneous damage in the Rqh1 null background, which we are able to provide in a resubmission. We will also clarify experimental details that the Reviewers found confusing in the original text. Lastly, the Reviewers highlighted minor figure adjustments that we will incorporate.

      Point-by-point Response:

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Vines et al adapted a system that has been used in S. cerevisiae to study the homology search and homologous recombination repair events by live cell imaging. The authors utilized a system they set up in a fission yeast strain that has a fluorescently tagged endonuclease induced DSB site and monitored RAD52 focus formation in both haploid and diploid cells. The main findings presented are that multiple strand invasion events occur during DSB repair and the role of Rqh1 in promoting these multiple events. For example, cells with Rqh1 loss either have a single strand invasion event that quickly leads to repair or a very long extensive repair time. Overall the results are intriguing with new insight into DSB repair being presented.*

      We appreciate the Reviewer’s recognition that our work provides new insights into homology-directed repair (HDR) in fission yeast.

      The manuscript would benefit from having another system to help to support or validate the key findings and/or the use of some mutants to help uncouple the different roles of Rad51 and/or Rqh1.

      While we agree with the Reviewer that using orthogonal approaches is always desirable, it is not clear what other experimental platform can address the dynamic events with single cell resolution that underlie our observations here; indeed, this was the motivation behind designing this new approach. However, we will provide additional, detailed context to support our findings in the revised manuscript that highlights how orthogonal experimental strategies (e.g. DSB repair outcome assays) already in the literature (e.g. Hope et al., PNAS, 2006) are consistent with our findings. Importantly, however, there is no other population-based system we are aware of that could demonstrate, for example, that Rqh1 shows two different behaviors in individual cells (repair failure and more rapid repair). See more in response to comment 7, below.

      **Major comment:**

      1) In Figure 1C, and also Figure 2D, the RAD52 focus observed does not appear in the same location as the LacO cassette. I assume this is because of the way the images are cropped. It would be nice if the authors are saying that the RAD52 focus co-localizes with the inducible DSB location for this to be more readily apparent in the representative images. *

      Co-localization events, indicated with the yellow circles, are assessed within raw 3D data that is then flattened for representation in 2D in the figures. For Figure 1C, the two events in the example cell indeed overlap in 3D space. However, in Figure 2D (cells lacking Rad51) we do not observe any colocalization events in the example (and there are no time points annotated with yellow circles).

      2) In Figure 3A, the authors claim that the mean time to repair an endonuclease induced DSB is 50 min +/- 20 min. It is unclear whether or not this experiment is done in a diploid strain.

      We apologize if we were not clear. All experiments presented in the manuscript are carried out in diploid cells. What varies is whether there is a lac operator integrated at one copy of Chr II (all experiments except Fig. 2A) or on both copies (only Fig. 2A). This will be clarified in the revised text.

      3) In Figure 3, whether or not this experiment represents asynchronous cells can greatly influence the timing of DSB repair, as the cell cycle is a huge contributor to HDR repair.

      We agree with the Reviewer - the cell cycle has a critical influence on the DSB repair mechanism. The diploid fission yeast in which we induced and observed DSBs are indeed asynchronous. However, in fission yeast, which spend over 80% of their cell cycle in G2, we can assess cell cycle stage by morphology; cytokinesis coincides with the beginning of G2, which then persists until mitotic entry (which is also very obvious from the nuclear shape as visualized by Rad52-mCherry). Moreover, we previously found that HO endonuclease only induces DSBs during S phase (Leland et al., eLife, 2018). Given this, for individual cells we observe site-specific DSBs beginning in late S and early G2 phases and all of our analysis is done at this phase of the cell cycle. These observations are further validated by the observation that an HO-induced DSB undergoes very high rates of gene conversion in fission yeast (Prudden et al., EMBO J., 2003).

      4) In Figure 3D, since a major finding of the paper is that there are multiple invasion events, it would be nice to show some representative images of a few cells where multiple pairings occur.

      In Supplementary Figure 2A, we provided an example of a cell with multiple encounters between the DSB and donor. This will be more clearly highlighted in the revised text.

      5) It is known from Eric Greene's work that RAD51 mediated homology search can do multiple samplings of 8-9 nucleotide segments. Have the authors considered the area around the DSB site and how many potential pairing sites there might be in this region? Is it possible that having a LAC array with repeated segments might be influencing the pairing since there would be multiple templates?

      We acknowledge that the homology of the region surrounding the DSB is important for faithful recognition of a homologous donor and that there could be many pairing sites surrounding our induced DSB after end resection. Such local sampling, however, would not be discernible due to the resolution of the light microscope (>0.2µm). We will address this noteworthy point during our discussion in the revision. Importantly, we placed the lacO array over 3 kb away from the locus where the HO recognition site is integrated on the homologous chromosome to attempt to avoid exactly the Reviewer’s concern.

      6) It would aid the reader if there were some picture schematics of what the authors think is occurring throughout the paper in the Figures. Since this is a results/discussion, this approach would be appropriate in lieu of a model figure at the end (which would also be very nice).

      We agree that diagrams would aid in communication of our hypotheses and interpretations, and these will be included in the revision.

      7) Since the occurrence of multiple strand invasion events is a major finding of the paper, it is important to test the hypothesis that multiple strand invasion events are occurring in a different way. A few ideas would be to examine Lorraine Symington's work on BIR where she observes multiple template switching events (Smith, CE, Llorente, B, Symington, LS (2007) Nature, 447(7140): 102-105) or something analogous to Wolf Heyer's recent study in Cell on template switching that the authors already cited. Another idea is to try a RAD51 mutant. For example, Doug Bishop's group has created a RAD51 mutant that uncouples the homology search from strand exchange, Rad51-II3A mutant (Cloud, V et al (2012) Science, 337(6099): 1222). Perhaps a mutant like this might be able to further support the key finding here.

      While our findings share parallels with the works raised by the Reviewer, we would argue that there is a fundamental difference between BIR-type assays and the one we present here, namely that we are visualizing multiple strand invasion events at the homologous chromosome in a normal, high fidelity repair event rather than multiple strand invasion events during BIR, which frequently result in translocations. Moreover, as the two chromosomes are perfectly homologous in our assay, we cannot leverage sequencing to reveal past strand invasion events that took place during HDR. We also cannot, unfortunately, assess multiple simultaneous strand invasion events due to the diffraction limit of the light microscope. We concede that it would be informative to further dissect strand invasion using tools such as the Rad51-II3A mutant described in budding yeast in work referenced above by Reviewer #1 and developed in fission yeast by Sarah Lambert’s group (Ait Saada et al., Mol. Cell, 2017). However, with the present limitations on our laboratory access and the timeline necessary to carry out this experiment, we feel this is currently beyond the scope of this work.

      8) It is surprising that Rqh1 doesn't have a role in DNA end resection since this is a conserved function from budding yeast to man. Would similar results to what is observed in Figure 4 be observed in a Dna2 or Exo1 mutant?

      We acknowledge that Rqh1 orthologs in other organisms (BLM/Sgs1/etc.) have been shown to contribute to DSB end resection. However, previous work from our group indicates that Rqh1 is entirely dispensable for long-range resection in fission yeast (Leland et al., eLife, 2018). Interestingly, in this work we also demonstrated that it is only upon loss of either the 53BP1/Rad9 orthologue Crb2 or Rev7 that Rqh1 is able to compensate for loss of Exo1. It remains unclear whether this is a peculiarity of fission yeast (perhaps because they rely heavily on HR due to extensive time in G2) or if it is a direct consequence of the long G2 itself. Regardless, we demonstrated that cells lacking Exo1 cannot generate sufficient ssDNA tracts to load visualizable Rad52-mCherry (Leland et al., eLife, 2018). Given this, we cannot address this genetic background in this assay. The essential role for Dna2 in replication has also precluded its analysis.

      **Minor comment:**

      1) As mentioned in the first line of the abstract, HDR is generally considered error-free as opposed to a pathway that "can be" error-free. *

      We acknowledge that HDR (and more specifically HR) is often error-free, but there are notable exceptions such as when a non-homologous donor is utilized for repair or when the polymerases engaged during repair incorporate errors (work from Haber and colleagues). We will expand and clarify this sentence in the revision.

      2) In Figure 2D, it is unclear whether this experiment is done in diploid cells. The rest of the figure is in diploid cells but two LacO cassettes are not present past the first frame. Please clarify in the legend and/or figure panel. As mentioned above, this is also confusing in Figure 3.

      As above, we monitored repair events in diploid cells only – this will be clarified in the revised text.

      *Reviewer #1 (Significance (Required)):

      The most important advancement in this paper is that multiple strand invasion events occur during homologous recombination and the role of Rqh1 in this process. Rqh1 is an important protein whose mutation is implicated in human disease such as Bloom syndrome and cancer. In addition, misregulation of double-strand break repair and particularly of Rad51 is associated with cancer. Therefore, understanding the basic mechanisms of how Rad51 mediates double-strand break repair and the role of Rqh1 in this process is critical for understanding fundamental aspects of cancer development.*

      We appreciate the Reviewer’s assessment of the impact of this work.

      *Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this study, Vines et al developed a microscopy-based assay to determine the kinetics of a site-specific interhomolog repair event, in living fission yeast cells. They detect efficient homology search and homology-directed repair in the system. They also observe that repair is likely to involve multiple site-specific and Rad51-dependent co-localization events between the DSB and donor sequence, suggesting that efficient inter-homologue repair involves multiple strand invasion events. Loss of the RecQ helicase Rqh1 leads to repair through a single strand invasion event. However, failure to repair is more frequent in rqh1 mutants, which could reflect increased strand invasion at non-homologous sites.

      Overall, I find the approach to investigate homology search and homology-directed repair using live cell imaging interesting and potentially very informative. The ability to observe the process in living cells, and with high temporal resolution, complements a variety of previous studies that employ more indirect approaches to invoke similar models. In particular, previous work by the Heyer, Lichten and Hunter laboratories, in budding yeast, has established that Sgs1 promotes non-crossover recombination by acting as a quality control in the maturation of HR intermediates. In this sense, while newly described here for fission yeast, it is not unexpected that homology-directed repair involves multiple strand invasion cycles. In my opinion, the strength of the work is the method/approach, rather than the specific conclusions made (even though I think that it is important to know how fission yeast cells perform homology search).*

      We thank the Reviewer for their appreciation of the value that cell biology can bring to the study of homology-directed repair. We wholeheartedly agree that this work is consistent with prior work on Sgs1. With regard to multiple strand invasion cycles, while we agree that there may be many in the field who could be unsurprised by this result, we would argue that 1) demonstrating this by direct visualization of individual DNA repair events has clear inherent value and 2) many studying homology search itself (or who have modeled homology search in silico, for example) do not incorporate multiple strand invasion cycles in their thinking. Thus, we would argue that this work goes beyond a technical feat and will have impact beyond the approach.

      *However, for the reasons detailed below, my general impression is that it isn't clear how robust the method is at delivering unambiguous information on the important questions asked:

      1) The authors state that they have developed a system to monitor the 'dynamics and kinetics' of an engineered, inter-homologue repair event. With this in mind, I was expecting a more detailed exploration of the process of homology search. For example, what happens at shorter time scales? Is it possible that by imaging at every 5 minutes many of the events are missed? Could the authors be missing very transient events (especially in rqh1 mutants) by using an inappropriate time scale? *

      We acknowledge that it would be ideal to observe DSB repair across a range of time scales in our system. For practical reasons we found it most valuable to choose the 5 minute time window since it was most amenable to observing the entire course of repair as often as possible in an asynchronous cell population (see our response to Reviewer #1’s comment 3 above) while mitigating photobleaching. However, we recognize that we sacrificed time resolution between acquired frames in order to do this. Like the Reviewer, we were also concerned that we were missing transient events due to an inappropriate timescale.

      To address this, we acquired additional data in WT cells with greater time resolution with a focus on encounter frequency rather than time to repair (as the overall length of the usable movie that we can obtain is shorter). When imaging WT cells with a site-specific DSB at 2 minute intervals (2.5 times more frequently), we do observe a shift (of ~ 1 encounter per 30 minute window) toward more colocalization events with the donor sequence. We also observe, however, that more sampling leads to an increase in random encounters as revealed by similar analysis of the two lacO control strain as described in the manuscript. These data will be included in the revision and suggest that we may be missing some transient encounter events while using 5 minute time points. As noted by Reviewer #2, this could account for repair in the subset of WT and Rqh1-null cells in which we observed no encounters. We will acknowledge these caveats in the revision but would argue that our data support the conclusion that loss of Rqh1 decreases the number and/or lifetime of strand invasion events.

      2) Another point relates to the Rad52 signal/foci, which is central to the study. While it is clear to me what the authors consider to be a focus of Rad52, I am not sure how to interpret what happens when Rad52 is as enriched throughout the entire nucleus as it is in the repair focus in the still before. For example, Figure 1C, 40 min vs 45 min. How do the authors interpret what is being visualised? Similarly, is the level of colocalization at 90 min really reflecting a specific enrichment of Rad52 at the DSB site? Much more of the Rad52 signal is away from the DSB. In other words, are quantitative criteria being used to assign colocalization events?

      As described in our Methods and the text, we used specific criteria to define 1) whether DSBs are site-specific and 2) whether they are colocalized with the donor site. In the images indicated as “contrast adjusted” we have scaled each panel time point individually with respect to the pixel intensities (that is, the least and most intense pixels have been set to the same value for each). This strategy allows us to convey relatively dim Rad52-mCherry foci, particularly early after DSB end resection. A consequence of this is that the apparent background for panels in which there is not a strong Rad52-mCherry focus will appear higher, while the background will appear relatively lower at time points with a strong Rad52-mCherry focus. For this reason we also present the raw image (found above). It is important to emphasize that when we are applying co-localization criteria, we do so within a 3D stack of images to ensure that the Rad52-mCherry signal and lacO array GFP signal coincide. In 2D representation, however, we understand that this may appear less clear.
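
      As an illustration only (this is not the authors' analysis code), the per-panel scaling described above can be sketched in a few lines of Python. The helper name `rescale_panel`, the toy 2x2 "panels", and the output range of 0 to 1 are all assumptions made for the example. The sketch shows why an identical raw background value looks brighter in a panel with a weak focus than in one with a strong focus.

```python
def rescale_panel(pixels, out_min=0.0, out_max=1.0):
    """Linearly rescale one panel so that its dimmest pixel maps to out_min
    and its brightest pixel maps to out_max.
    Hypothetical helper for illustration; not taken from the manuscript."""
    flat = [value for row in pixels for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat panel has no contrast to stretch.
        return [[out_min for _ in row] for row in pixels]
    scale = (out_max - out_min) / (hi - lo)
    return [[out_min + (value - lo) * scale for value in row] for row in pixels]


# Two toy time points with the same raw background (100 to 110)
# but different peak intensities (made-up numbers).
dim_panel = [[100, 110], [105, 120]]      # weak focus, peak = 120
bright_panel = [[100, 110], [105, 600]]   # strong focus, peak = 600

dim_scaled = rescale_panel(dim_panel)
bright_scaled = rescale_panel(bright_panel)

# The same raw value (110) is displayed very differently after scaling.
print(dim_scaled[0][1], bright_scaled[0][1])
```

      Because each panel is stretched to its own minimum and maximum, displayed intensities are only comparable within a panel, not between time points, which is why the raw images remain the reference for comparisons across frames.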

      In the particular case of the colocalization in Figure 1C at 90 minutes that the Reviewer points out, it is more evident in the 3-D Z stacks that the surrounding mCherry signal apart from the colocalization with the lacO array is due to inhomogeneity in the background signal. Another contribution is that the lacO array signal often becomes delocalized during colocalization events (as evident in that 90 minute time point). Although this is an interesting observation, we are still investigating what activity may explain this response. We will address the caveats of our colocalization analysis more fully in the revision.

      3) In the system described here, Rad52 foci form in only ~15% of cells. I think it would be important to rationalise this low number in the manuscript. Moreover, G2 Rad52 foci still form at considerable rates in cells without HO. I think it would be important that the authors provide some explanation on what this might reflect.

      There are several considerations that we believe contribute to this observation, which we also documented previously in haploid cells (Leland et al., eLife, 2018). First and foremost, this assay is quite different from endpoint assays that involve induction of HO nuclease because we analyze only those events that happen immediately after addition of uracil to elevate HO endonuclease expression under the control of the urg1 promoter. Combined with the efficient repair of any DSB induced by leaky HO expression (taking less than an hour according to our data), we likely miss events that have already taken place or would take place later in other assay systems. Lastly, it is established that nucleosomes can prevent HO cleavage in its intrinsic role in budding yeast (Laurenson and Rine, Microbiol. Rev., 1992; Haber, Ann. Rev. Genet., 1998); we cannot rule out that cleavage at this particular site is less efficient due to intrinsic nucleosome stability. With respect to spontaneous DNA damage, most of this is short-lived and occurs in S-phase, likely due to replication stress, although we occasionally observe long-lived Rad52 foci in a sub-population of cells – this is in line with previous publications (Coulon et al., MBoC, 2006; Lorenz et al., Mol. Cell Biol., 2009; Sanchez et al., Mol. Cell Biol., 2012; Schonbrun et al., J Biol. Chem., 2013). We will provide a greater explanation of the observed induction rate in the revision.

      **Other issues to consider:**

      4) In Figure 2D, the overlay does not show any green. Is it possible that the green channel was not overlaid with the pink? *

      We apologize for this error and very much appreciate the Reviewer noticing that it is missing from the merged image. This will be corrected.

      5) In Figure 2D, the unadjusted images for Rad52 are very sharp. Did the authors perform contrast adjustment in the top panels? If so, this should be indicated. My current impression is that the data was duplicated by mistake.

      The Rad52-mCherry data in Figure 2D was labelled correctly and not duplicated. Because cells lacking Rad51 accumulate extensively resected DSBs (and therefore abnormally high levels of Rad52 loading), the intensity of Rad52-mCherry is very high. For simplicity we will remove the contrast-adjusted Rad52-mCherry images in the revision.

      6) I don't understand why the time since nuclear division is different in every single figure. For simplicity, it would be much better to start every figure at T=0.

      We agree with the Reviewer. In the revision we will normalize all kymographs to begin at t=0 with the exception of the Fig. S1D (where we are visualizing the subsequent division).

      *Reviewer #2 (Significance (Required)):

      see above.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors describe a system to monitor an inducible site-specific double-strand break (DSB) and the undamaged homologous locus during homology-directed repair in S. pombe cells. The authors show that the Rad52 focus on the induced DSB is more persistent than spontaneous Rad52 foci that form throughout the cell cycle. The persistent Rad52 focus intermittently colocalizes with the donor sequence labeled with LacI-GFP, reflecting multiple strand invasion events, and this colocalization requires the Rad51 recombinase. The authors report that the time to repair is dependent on the number of strand invasion events (colocalization of Rad52 and homolog), and that the initial distance between the induced DSB and the homolog predicts the time to their first contact, but does not predict the time to repair. Lastly, the authors claim that repair in rqh1Δ cells is bimodal, either failing to repair within the experimental time frame, or being more efficient than WT cells (which often involves a single colocalization event).

      **These claims are supported by the data:**

      1) Rad52 focus on the induced DSB is more persistent than spontaneous Rad52 foci that form throughout the cell cycle.

      2) Multiple colocalization events between Rad52 focus and the donor sequence are frequent, and this colocalization is dependent on Rad51, which reflects multiple strand invasion events.

      3) rqh1Δ cells have a lower rate of productive repair compared to WT cells. *

      The key concern I have for this section is the noise in Rad52 images. For example, in Fig. 1C at 15 minutes, it looks like there is a Rad52 focus both before and after adjustment but the time point is labeled as not having a Rad52 focus. Conversely, in Fig. 2D at 60 minutes, it looks like there isn't a Rad52 focus but the time point is labeled as having a Rad52 focus. How did the authors determine the presence of a Rad52 focus? Additionally, it is difficult to assess colocalization of Rad52 and LacI-GFP in merged images (hard to see Rad52 focus in Fig. 1C merged and LacI-GFP in Fig. 2D merged).

      The criterion that we established to indicate a Rad52-mCherry focus (as annotated by a pink circle and as explained in the Methods) is that it persists for at least three frames (>15 minutes). This was chosen because it is a characteristic of the HO-induced DSB but not of spontaneous DNA damage that occurs frequently during S-phase. Indeed, the numerous, small, and short-lived foci at the 15 minute time point in Fig. 1C referred to by the Reviewer occur just 15 minutes after nuclear division and are perfectly characteristic of replication stress that is independent of HO endonuclease expression. Thus, the pink circles indicate a specific type of Rad52-mCherry focus that is relevant for the assay. We agree that the Rad52-mCherry focus in Fig. 2D at ~60 minutes is poorly visualized in the flattened image, but would like to emphasize that we assess the foci in the true 3D volume. With regard to the merged images, we will adjust the individual signals to make it easier for the reader to assess colocalization in the revision.
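
      A minimal sketch of a persistence criterion of this kind is given below, purely as an illustration and not as the authors' scoring procedure. It assumes that focus detection has already produced one yes/no call per frame; the function names and the example traces are hypothetical, and only the three-frame threshold comes from the response above.

```python
def longest_run(present):
    """Length of the longest stretch of consecutive frames with a focus."""
    best = current = 0
    for has_focus in present:
        current = current + 1 if has_focus else 0
        best = max(best, current)
    return best


def is_persistent_focus(present, min_frames=3):
    """True if a focus is seen in at least `min_frames` consecutive frames.
    The default of 3 follows the criterion stated in the response; the
    per-frame detection itself is assumed to have been done elsewhere."""
    return longest_run(present) >= min_frames


# Hypothetical per-frame calls for two cells.
transient = [False, True, False, True, True, False]     # short-lived foci only
persistent = [False, True, True, True, True, False]     # focus held for 4 frames

print(is_persistent_focus(transient))   # False
print(is_persistent_focus(persistent))  # True
```

      Requiring consecutive frames, rather than a total count, is what separates a single long-lived focus from several unrelated short-lived foci in the same nucleus.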

      **These claims are supported by weak data:**

      1) The initial distance between the induced DSB and donor sequence predicts the time to their first physical encounter (Line 60). *

      We agree with the Reviewer that our word choice (“predicts”) suggests a stronger relationship than is supported by the data. However, we also argue that there is nonetheless a meaningful correlation. We believe this is an important point to make because it supports prior work in budding yeast suggesting that relative position affects donor choice preference. We will edit this language in the revised text.

      2) Repair efficiency is dictated by the number of strand invasion events (Line 61-62). Figures 3E and 3F technically have positive correlations that support the authors' claims but there is a lot of noise. I think the data needs to be more robust, especially considering the strong wording used to describe the data. A minor comment on Fig. 3F: why is there a data point with 3.5 encounters?

      Again, we agree with the Reviewer that our word choice (“dictate”) is too strong given the data and we will edit the text accordingly. We thank the reviewer for noticing the error in Fig. 3F, which will be corrected.

      **These claims are not supported by the data:**

      1) In the absence of Rqh1, successful repair requires a single strand invasion event (Line 63). *

      We acknowledge that this is too strong a claim to make based on our data and will amend this language in the revised text. Specifically, and as outlined in our response to Reviewer #2 with regard to our imaging frequency, we will revise the manuscript to state that cells lacking Rqh1 are more likely to repair without a visualized colocalization event and/or they possess shorter-lived strand invasion events. Importantly, repair outcome assays indicate that cells lacking Rqh1 display elevated gene conversion rates rather than non-HDR-mediated repair (Hope et al., PNAS, 2006). Thus, we do not expect that the lack of colocalization reflects NHEJ but rather our inability to “catch” the colocalization event with the temporal resolution we can achieve.

      2) rqh1Δ cells that complete repair are more efficient than WT cells and often involve a single colocalization event (Line 178-179).

      As for the above, we agree that our claim that rqh1Δ cells “often” involve a single colocalization event is too strong a claim based on our data. We will amend this language in the revised text.

      Fig. 4A shows an example of a rqh1Δ cell with productive repair but without any colocalization with the homolog, which contradicts the statement that successful repair requires a single strand invasion event in the absence of Rqh1. If the authors interpreted the single continuous presence of Rad52 focus during time-lapse as evidence of a single strand invasion event, then it would nullify using multiple colocalization events as evidence for multiple strand invasion events. In other words, the data in Fig. 3D that clearly displays multiple colocalization events in individual cells during repair can no longer be evidence of multiple strand invasion events since those cells all had one continuous presence of Rad52 focus.

      We believe that we understand the confusion that the Reviewer is articulating in their comment and apologize that we have not been clearer in explaining our interpretation. For this site-specific DSB to be repaired, we expect that it must either 1) engage with the homologous chromosome to be repaired by HR/BIR or 2) be repaired through an alternative pathway – at this non-repetitive, resected locus this would likely be a microhomology-mediated (alt-) NHEJ mechanism. However, prior analysis of repair outcome in a model of interhomologue repair in the absence of Rqh1 (Hope et al., PNAS, 2006) demonstrates an increase in cross-over HR events rather than end joining events, arguing that interhomologue HR still dominates (and with increased CO to NCO frequency). We interpret the continuous presence of a Rad52 focus to only reflect that a DSB has been subjected to resection and has not yet been repaired. Taking these two points together, within the lifetime of a Rad52-loaded DSB it can either 1) never colocalize with the donor sequence and fail to repair (as in cells lacking Rad51, Fig. 2D-F) or 2) undergo strand invasion (and therefore colocalization) at least one time (but possibly multiple times) to allow for HDR to occur. However, we agree (and must clarify in the revision) that we often infer that at least one strand invasion event has taken place to support successful HDR when we do not capture the event at our experimental time resolution. Based on the additional data at shorter timescales that we will add to the revised manuscript (as outlined in the response to Reviewer 2, point 1), which demonstrates that we may in some cases be undercounting relevant colocalization events that are too brief to be accurately captured with 5 minute time resolution, we think the most parsimonious explanation is that cells lacking Rqh1 spend less time with the DSB and donor sequence colocalized prior to repair. 
      We agree with the Reviewer, however, that we cannot say whether this reflects a shorter duration of interactions and/or fewer interactions. We will therefore revise the manuscript to acknowledge this point.

      Regarding the second claim, I think Fig. 4D only shows rqh1Δ cells with successful repair (since the longest repair time is 55 minutes, but it is not clear from the figure legend). It is not shown how many colocalization events these cells had in Fig. 4D, but there are 16 cells in Fig. 4D while there are only 2 cells with a single encounter (shown in Fig. 4F). With these numbers, it seems like rqh1Δ cells that complete repair are more efficient than WT cells but only few of these cells involve a single colocalization event.

      The Reviewer is correct, Figure 4D does indeed show only rqh1Δ cells with the site-specific DSB that successfully repair – this will be clarified in the revision text. As described above in our response to Reviewer #2’s comment 1, it may be that we are missing colocalization events in rqh1Δ DSB cells. However, we would argue that our data do support that, for cells lacking Rqh1 that execute repair, there are fewer and/or shorter-lived colocalization events. Again, this will be made clear in the revision.

      Also, how often do Rad52 foci form spontaneously in rqh1Δ cells and what is the duration? This data was provided for WT but not for rqh1Δ.

      We agree that increased levels of genome instability (and therefore Rad52 foci) would present an issue – and indeed this has prevented us from analyzing some genetic backgrounds. However, we do not observe a significant increase in spontaneous Rad52-mCherry focus formation in rqh1Δ cells. This data will be included in the revision.

      All of the data would have been better supported if the homologous chromosome had been tagged. Such a configuration would really have helped the interpretation of the rqh1∆ data.

      We agree that in theory it would be advantageous to have both copies of the chromosome tagged. Indeed, we attempted to leverage a different version of this experimental system with lacO arrays on both copies while inducing a DSB. However, the complexity of monitoring (and keeping the identity clear) for the two copies presented major challenges. Better would be two distinct arrays – an approach that has been used in budding yeast. However, to date many groups, including ours, have been unable to get TetO-TetR arrays to perform well in fission yeast.

      Reviewer #3 (Significance (Required)):

      The significance of this work is the conceptual advance in the field of DNA repair. Homology search is an important process in homology-directed repair and is not fully understood. This study reports time-lapse data on the interaction between a DSB and its donor template during repair and provides insight into the kinetics of homology search. The audience for this manuscript is the field of DNA repair, and to a lesser extent, the field of live-cell imaging.


    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, the authors describe a system to monitor an inducible site-specific double-strand break (DSB) and the undamaged homologous locus during homology-directed repair in S. pombe cells. The authors show that the Rad52 focus on the induced DSB is more persistent than spontaneous Rad52 foci that form throughout the cell cycle. The persistent Rad52 focus intermittently colocalizes with the donor sequence labeled with LacI-GFP, reflecting multiple strand invasion events, and this colocalization requires the Rad51 recombinase. The authors report that the time to repair is dependent on the number of strand invasion events (colocalization of Rad52 and homolog), and that the initial distance between the induced DSB and the homolog predicts the time to their first contact, but does not predict the time to repair. Lastly, the authors claim that repair in rqh1Δ cells is bimodal, either failing to repair within the experimental time frame, or being more efficient than WT cells (which often involves a single colocalization event).

      These claims are supported by the data:

      1) Rad52 focus on the induced DSB is more persistent than spontaneous Rad52 foci that form throughout the cell cycle.

      2) Multiple colocalization events between Rad52 focus and the donor sequence are frequent, and this colocalization is dependent on Rad51, which reflects multiple strand invasion events.

      3) rqh1Δ cells have a lower rate of productive repair compared to WT cells. The key concern I have for this section is the noise in Rad52 images. For example, in Fig. 1C at 15 minutes, it looks like there is a Rad52 focus both before and after adjustment but the time point is labeled as not having a Rad52 focus. Conversely, in Fig. 2D at 60 minutes, it looks like there isn't a Rad52 focus but the time point is labeled as having a Rad52 focus. How did the authors determine the presence of a Rad52 focus? Additionally, it is difficult to assess colocalization of Rad52 and LacI-GFP in merged images (hard to see Rad52 focus in Fig. 1C merged and LacI-GFP in Fig. 2D merged).

      These claims are supported by weak data:

      1) The initial distance between the induced DSB and donor sequence predicts the time to their first physical encounter (Line 60).

      2) Repair efficiency is dictated by the number of strand invasion events (Line 61-62). Figures 3E and 3F technically have positive correlations that support the authors' claims but there is a lot of noise. I think the data needs to be more robust, especially considering the strong wording used to describe the data. A minor comment on Fig. 3F: why is there a data point with 3.5 encounters?

      These claims are not supported by the data:

      1) In the absence of Rqh1, successful repair requires a single strand invasion event (Line 63).

      2) rqh1Δ cells that complete repair are more efficient than WT cells and often involve a single colocalization event (Line 178-179). Fig. 4A shows an example of a rqh1Δ cell with productive repair but without any colocalization with the homolog, which contradicts the statement that successful repair requires a single strand invasion event in the absence of Rqh1. If the authors interpreted the single continuous presence of Rad52 focus during time-lapse as evidence of a single strand invasion event, then it would nullify using multiple colocalization events as evidence for multiple strand invasion events. In other words, the data in Fig. 3D that clearly displays multiple colocalization events in individual cells during repair can no longer be evidence of multiple strand invasion events since those cells all had one continuous presence of Rad52 focus. Regarding the second claim, I think Fig. 4D only shows rqh1Δ cells with successful repair (since the longest repair time is 55 minutes, but it is not clear from the figure legend). It is not shown how many colocalization events these cells had in Fig. 4D, but there are 16 cells in Fig. 4D while there are only 2 cells with a single encounter (shown in Fig. 4F). With these numbers, it seems like rqh1Δ cells that complete repair are more efficient than WT cells but only few of these cells involve a single colocalization event. Also, how often do Rad52 foci form spontaneously in rqh1Δ cells and what is the duration? This data was provided for WT but not for rqh1Δ. All of the data would have been better supported if the homologous chromosome had been tagged. Such a configuration would really have helped the interpretation of the rqh1∆ data.

      Significance

      The significance of this work is the conceptual advance in the field of DNA repair. Homology search is an important process in homology-directed repair and is not fully understood. This study reports time-lapse data on the interaction between a DSB and its donor template during repair and provides insight into the kinetics of homology search. The audience for this manuscript is the field of DNA repair, and to a lesser extent, the field of live-cell imaging.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study, Vines et al developed a microscopy-based assay to determine the kinetics of a site-specific interhomolog repair event, in living fission yeast cells. They detect efficient homology search and homology-directed repair in the system. They also observe that repair is likely to involve multiple site-specific and Rad51-dependent co-localization events between the DSB and donor sequence, suggesting that efficient inter-homologue repair involves multiple strand invasion events. Loss of the RecQ helicase Rqh1 leads to repair through a single strand invasion event. However, failure to repair is more frequent in rqh1 mutants, which could reflect increased strand invasion at non-homologous sites.

      Overall, I find the approach to investigate homology search and homology-directed repair using live cell imaging interesting and potentially very informative. The ability to observe the process in living cells, and with high temporal resolution, complements a variety of previous studies that employ more indirect approaches to invoke similar models. In particular, previous work by the Heyer, Lichten and Hunter laboratories, in budding yeast, has established that Sgs1 promotes non-crossover recombination by acting as a quality control in the maturation of HR intermediates. In this sense, while newly described here for fission yeast, it is not unexpected that homology-directed repair involves multiple strand invasion cycles. In my opinion, the strength of the work is the method/approach, rather than the specific conclusions made (even though I think that it is important to know how fission yeast cells perform homology search). However, for the reasons detailed below, my general impression is that it isn't clear how robust the method is at delivering unambiguous information on the important questions asked:

      1) The authors state that they have developed a system to monitor the 'dynamics and kinetics' of an engineered, inter-homologue repair event. With this in mind, I was expecting a more detailed exploration of the process of homology search. For example, what happens at shorter time scales? Is it possible that by imaging at every 5 minutes many of the events are missed? Could the authors be missing very transient events (especially in rqh1 mutants) by using an inappropriate time scale?

      2) Another point relates to the Rad52 signal/foci, which is central to the study. While it is clear to me what the authors consider to be a focus of Rad52, I am not sure how to interpret what happens when Rad52 is as enriched throughout the entire nucleus as it is in the repair focus in the still before. For example, Figure 1C, 40 min vs 45 min. How do the authors interpret what is being visualised? Similarly, is the level of colocalization at 90 min really reflecting a specific enrichment of Rad52 at the DSB site? Much more of the Rad52 signal is away from the DSB. In other words, are quantitative criteria being used to assign colocalization events?

      3) In the system described here, Rad52 foci form in only ~15% of cells. I think it would be important to rationalise this low number in the manuscript. Moreover, G2 Rad52 foci still form at considerable rates in cells without HO. I think it would be important that the authors provide some explanation on what this might reflect.

      Other issues to consider:

      4) In Figure 2D, the overlay does not show any green. Is it possible that the green channel was not overlaid with the pink?

      5) In Figure 2D, the unadjusted images for Rad52 are very sharp. Did the authors perform contrast adjustment in the top panels? If so, this should be indicated. My current impression is that the data was duplicated by mistake.

      6) I don't understand why the time since nuclear division is different in every single figure. For simplicity, it would be much better to start every figure at T=0.

      Significance

      see above.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Vines et al adapted a system that has been used in S. cerevisiae to study the homology search and homologous recombination repair events by live cell imaging. The authors utilized a system they set up in a fission yeast strain that has a fluorescently tagged endonuclease induced DSB site and monitored RAD52 focus formation in both haploid and diploid cells. The main findings presented are that multiple strand invasion events occur during DSB repair and the role of Rqh1 in promoting these multiple events. For example, cells with Rqh1 loss either have a single strand invasion event that quickly leads to repair or a very long extensive repair time. Overall the results are intriguing with new insight into DSB repair being presented. The manuscript would benefit from having another system to help to support or validate the key findings and/or the use of some mutants to help uncouple the different roles of Rad51 and/or Rqh1.

      Major comment:

      1) In Figure 1C, and also Figure 2D, the RAD52 focus observed does not appear in the same location as the LacO cassette. I assume this is because of the way the images are cropped. If the authors are saying that the RAD52 focus co-localizes with the inducible DSB location, it would be nice for this to be more readily apparent in the representative images.

      2) In Figure 3A, the authors claim that the mean time to repair an endonuclease induced DSB is 50 min +/- 20 min. It is unclear whether or not this experiment is done in a diploid strain.

      3) In Figure 3, whether or not this experiment represents asynchronous cells can greatly influence the timing of DSB repair, as the cell cycle is a huge contributor to HDR repair.

      4) In Figure 3D, since a major finding of the paper is that there are multiple invasion events, it would be nice to show some representative images of a few cells where multiple pairings occur.

      5) It is known from Eric Greene's work that RAD51 mediated homology search can do multiple samplings of 8-9 nucleotide segments. Have the authors considered the area around the DSB site and how many potential pairing sites there might be in this region? Is it possible that having a LAC array with repeated segments might be influencing the pairing since there would be multiple templates?

      6) It would aid the reader if there were some picture schematics of what the authors think is occurring throughout the paper in the Figures. Since this is a results/discussion, this approach would be appropriate in lieu of a model figure at the end (which would also be very nice).

      7) Since multiple strand invasion events are a major finding of the paper, it is important to test the hypothesis that multiple strand invasion events are occurring in a different way. A few ideas would be to examine Lorraine Symington's work on BIR where she observes multiple template switching events (Smith, CE, Llorente, B, Symington, LS (2007) Nature, 447(7140): 102-105) or something analogous to Wolf Heyer's recent study in Cell on template switching that the authors already cited. Another idea is to try a RAD51 mutant. For example, Doug Bishop's group has created a RAD51 mutant that uncouples the homology search from strand exchange, Rad51-II3A mutant (Cloud, V et al (2012) Science, 337(6099): 1222). Perhaps a mutant like this might be able to further support the key finding here.

      8) It is surprising that Rqh1 doesn't have a role in DNA end resection since this is a conserved function from budding yeast to man. Would similar results to what is observed in Figure 4 be observed in a Dna2 or Exo1 mutant?

      Minor comment:

      1) As mentioned in the first line of the abstract, HDR is generally considered error-free as opposed to a pathway that "can be" error-free.

      2) In Figure 2D, it is unclear whether this experiment is done in diploid cells. The rest of the figure is in diploid cells but two LacO cassettes are not present past the first frame. Please clarify in the legend and/or figure panel. As mentioned above, this is also confusing in Figure 3.

      Significance

      The most important advancement in this paper is that multiple strand invasion events occur during homologous recombination and the role of Rqh1 in this process. Rqh1 is an important protein whose mutation is implicated in human disease such as Bloom syndrome and cancer. In addition, misregulation of double-strand break repair and particularly of Rad51 is associated with cancer. Therefore, understanding the basic mechanisms of how Rad51 mediates double-strand break repair and the role of Rqh1 in this process is critical for understanding fundamental aspects of cancer development.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their comments and outline below how we plan to address them.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:** Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). The authors here describe a method to modify bacterial artificial chromosomes (BAC) harbouring gene loci from eukaryotes. When wanting to modify a BAC an antibiotic selection cassette is often included alongside the desired mutation/modification to increase the number of successful recombinants in E. coli. Traditionally, this is removed in a second recombination process to leave only the desired modification. The novelty in the procedure described herein is to add a synthetic intron consensus sequence around the selection cassette, which eliminates the need for the subsequent removal of the antibiotic cassette from the BAC before transfection into mammalian cells, saving time and resources. The technique is clever in its simplicity and appears to function for a number of gene loci. The authors validated the correct functioning of the modified BACs for a number of genes using three main assays - transcript level, protein level and localisation.

      **Major comments:**

      *Are the key conclusions convincing?* The conclusion that the method described generates functional modified BACs is valid.

      *Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?* While the method is successfully employed in this study, its efficiency is not quantified in relation to the state-of-the-art as described in the introduction. One assumes it would be more efficient, but this has not been tested empirically in the paper. Does the inclusion of the synthetic intron sequence have an effect on the efficiency of modifying BACs compared to a more typical two-step positive/negative antibiotic selection cassette?

      This is a good point that we did not directly address. In general, the efficiency is similar to that of integrating any cassette with a selectable marker, as has been published (Poser et al 2008), and therefore also higher than the two-step counterselection method, which requires such a cassette integration in the first step alone. We will include new data specifically addressing the efficiency of our new method (see specifics below).

      The functionality of this approach rests entirely on the ability of the target cell to correctly splice out the synthetic intron. The authors are aware of this potential problem as highlighted in the lines below, but do not make efforts to explicitly test splicing. On lines 224-225, the authors state "We cannot exclude that a small portion of synthetic introns within individual cells are misspliced". On lines 230-231 it is stated that "mis-spliced mRNAs are probably minimal and degraded by nonsense-mediated decay". On lines 215-217, the authors describe an "investigation of transgenic lines at the single-cell level" that suggests "the synthetic intron is correctly spliced out in all the cells of the population". How do the authors reach this conclusion? U2OS and HeLa cells are considered very "robust" and may not show detectable consequences when stressed with an increased level of nonsense-mediated decay. Further, many genes maintain a high level of expression that buffers them against small changes in transcription/splicing. The synthetic intron might have a bigger impact on more tightly regulated genes, so assessing the splicing rate would be essential if the authors wish to advocate their technique as generally applicable.

      We will assay for splicing efficiency as outlined below.

      The ability of the synthetic intron to be removed from final transcripts depends on functioning splicing machinery. The authors might emphasise this issue, as spliceosome mutations are important fields of study and might not be compatible with this method.

      We can add this to the text.

      The authors used un-directed integration of each BAC under study. Therefore, it is hard to assess what effect the synthetic intron has, as the authors only ever assess the downstream levels of the correctly spliced, translated and localised protein. The authors themselves state that this can lead to clonal variations in expression of up to 2-fold and on line 250 that this variation "could compensate for synthetic intron effects", but make no effort to test this. Again, lines 267-268 highlight the potential dangers of potential effects of the synthetic introns, but do not test these.

      *Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.* If not already performed, a large number of bacterial colonies should be screened for the correct modification and frequency of correct ones reported. This frequency - reported for at least three different modifications - would estimate what sort of efficiency this method provides. The modified region of each BAC should be sequenced and the results reported. The rate of exactly modified clones is important, in case of spontaneous or low fidelity integration of the antibiotic cassette. The percentage of transcripts that have the synthetic intron correctly spliced out should be measured for some of the BAC constructs used in the study. A direct head-to-head comparison of this newer method compared to other techniques, or even the authors' own previous two-step approach is necessary to assess the benefits of this method. Preferably, the experiment would be run in parallel with and without antibiotic selection applied, to show that it drastically improves chances of finding a correct clone.

      We will generate 3 new mutations in BACs and analyze both the efficiency of integration by PCR and accuracy via sequencing. In practice, we have observed that the efficiency is similar to any other cassette integration, such as a GFP tag (Poser et al Nature Methods 2008) or a counterselection cassette (Bird et al Nature Methods 2012) (80-90%). Integrating a mutation via the second step of the counterselection method introduces a further 20% decrease in efficiencies on average.

      *Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.* Repeating the transformation of the BAC and targeting cassette and assessing the recombination efficiency and sequencing should only require existing reagents and take less than a week or two to complete. Quantitative RT-PCR to assess the percentage of transcripts that have the synthetic intron spliced out would take a little more work. However, this should not be a considerable investment in time or resources for a standard microbiology laboratory and could be completed within a few weeks using modern techniques, such as that described in Londoño et al. 2016. Repeating all the experiments in parallel would be considerable work and would only be strictly necessary if the authors wish to emphasise the benefits of their method over the many others already in wide use.

      We will use quantitative PCR to estimate the fraction of transcripts that correctly splice out the artificial intron for two clonal cell lines characterized in the study: RNAi-resistant AurA-GFP (Fig 4), and GTSE1-14A (newly introduced; see below). While the exact method described in Londoño et al 2016 will not be applicable due to the larger size of the artificial intron, we believe we can adapt it to detect different splicing events.
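
As a rough illustration of how such a fraction can be derived from qPCR readings, the sketch below uses the common approximation that template abundance scales as 2 to the power of minus the threshold cycle (Ct). It is not the authors' assay: the function name, the use of one amplicon for the correctly spliced junction and one for the intron-retaining product, the assumption of equal and perfect amplification efficiency for both amplicons, and the Ct values are all hypothetical.

```python
def spliced_fraction(ct_spliced: float, ct_unspliced: float) -> float:
    """Estimate the fraction of transcripts that are correctly spliced.

    Assumptions (illustrative, not from the rebuttal): abundance is
    proportional to 2 ** (-Ct), and both amplicons amplify with the same,
    perfect efficiency (doubling every cycle).
    """
    # Ratio of spliced to unspliced abundance: 2**(-ct_s) / 2**(-ct_u).
    ratio = 2.0 ** (ct_unspliced - ct_spliced)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Hypothetical Ct values: the unspliced product appears 5 cycles later.
    fraction = spliced_fraction(ct_spliced=20.0, ct_unspliced=25.0)
    print(f"Estimated correctly spliced fraction: {fraction:.3f}")
```

In practice the two amplicons rarely amplify with identical efficiency, so a standard curve or efficiency correction would be needed before reading such a number as an absolute splicing rate.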

      *Are the data and the methods presented in such a way that they can be reproduced?* Barring the omission of Table S1, which presumably includes exact information on the BACs modified and sequences used etc., there is sufficient other data and methods to allow the experiments to be repeated. Targeting the ESI procedure to the middle of exons is likely to have a bigger impact for smaller exons as the authors mention on lines 99-100. Making it clear which exon sizes for each gene were successfully targeted in this study would help give some idea of how significant a problem this might be. Perhaps Table S1 contains this information, but it was not provided. It would also help reviewers check the design strategies.

      We apologize for inadvertently failing to upload Table S1 on bioRxiv. It has been uploaded now as part of this submission process. This table indeed contains BAC and target sequence information, including the size of the targeted exon (and the 2 “new” resulting exons). Targeted exons range in size from 138bp to 1537bp, and “new” exons are as small as 48bp.

      *Are the experiments adequately replicated and statistical analysis adequate?* The replication and statistical analysis of the data as presented appear adequate. Figure legends should state the statistic used to generate error bars.

      This will be updated.

      **Minor comments:** Specific experimental issues that are easily addressable. Are the promoters used in the vectors described universally functional? For example, is the PGK promoter functional in yeast?

      The PGK promoter contained in the cassettes is a mammalian promoter, which has also been reported to work in flies.

      *Are prior studies referenced appropriately?* The manuscript may benefit from the referencing of BAC modification techniques from a wider variety of groups, such as those using CRISPR-guided recombineering (Pyne et al. 2015).

      We will add citations of more techniques.

      *Are the text and figures clear and accurate?* The body text is very clear save minor typographical or grammatical errors. Regarding figures, some of the coloured text in Figure 1 is somewhat illegible when printed in grayscale. Line 278 - The acronyms LAP and NLAP are not defined/explained. Antibody section starting Line 282 may fit better next to Western Blot section. Figure 2C - The blot images would benefit from arrows to indicate expected sizes of proteins. Figure 3A - the graph may benefit from a dashed line at 100% to highlight that values are normalised to controls. Figure 4 - The differences between panels B & C are unclear. Figure 4E - The legend could provide a little more detail on cell cycle stage/status of the captured cells.

      All of the above will be addressed accordingly.

      *Do you have suggestions that would help the authors improve the presentation of their data and conclusions?* Lines 23-27 are somewhat unclear and feel out of context. Perhaps the authors could clarify this as a further advantage of using BACs instead of endogenous gene modifications.

      Thanks for the input, we will clarify this.

      While not affecting the factual content of the paper, I would advocate that the authors format the method described in Figure S3 into a more detailed text based layout similar to that seen in a typical Nature Methods article. However, this may depend on the format required by any eventual publishing journal.

      We prefer the graphical protocol, but will discuss whether to add a text protocol with the journal editor.

      That all of the work in the paper was carried out in human cell lines and using human genes is a further caveat, but the authors admit this in the discussion and one would assume that most mammalian cells would respond similarly in their ability to splice out the synthetic intron.

      Reviewer #1 (Significance (Required)):

      *Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.* This work is a formal description of a newer method that could be useful for many of those employing bacterial artificial chromosomes in numerous studies, such as gene regulation.

      *Place the work in the context of the existing literature (provide references, where appropriate).* This work builds on methodology previously published by the authors - a counter-selection two-step procedure (Bird et al. 2011). It sets out to formally describe a method merely mentioned as "BAC intronization" in a later paper by some of the authors (Zheng et al. 2014). Other alternative one-step procedures are also available, but present a different set of challenges (Lyozin et al. 2014). Some newer approaches, such as those using CRISPR-guided recombineering (Pyne et al. 2015) or systems that combine CRISPR and positive/negative selection cassettes (Wang et al. 2016) may be slightly more efficient, but are also more complex in their design.

      Bird et al. 2011 DOI: 10/dv776q

      Pyne et al. 2015 DOI: 10/f7jx92

      Wang et al. 2016 DOI: 10/f89db5

      Zheng et al. 2014 DOI: 10/f5pkr6

      *State what audience might be interested in and influenced by the reported findings.* As a technology paper this work should have interest from a broad field of research. While the use of BACs could sometimes be considered more traditional in light of the explosion in CRISPR-based genome editing capabilities, it is definitely seeing a resurgence as the limitations of CRISPR in modifying large regions of genome become more apparent. Therefore, technologies that accelerate the modification of BACs could prove increasingly useful. As a category of audience, all those involved in significant recombineering or gene/genome engineering would potentially benefit.

      *Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.* Synthetic genomics, synthetic biology, cancer cell biology, gene and genome engineering

      REFEREES CROSS COMMENTING

      I would agree with reviewer two's assessment that we both view the paper in a similar light.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This is a methods-focused paper that presents a strategy to efficiently introduce mutations into a bacterial artificial transgene using synthetic introns. BAC-based methods have been an effective strategy for introducing transgenes into human cells to achieve near-endogenous expression, including extensive work from these authors. However, generating mutations and changes within the internal coding sequence presents some challenges for how to target these mutations and select for the mutated form. Here, the authors describe a way to overcome this by introducing synthetic introns into an adjacent sequence. This allows them to introduce a selectable marker and conduct the molecular biology without creating complications downstream for the functionality of the protein. This method is carefully described and presented. The authors also provide clear validation by using this to create RNAi-resistant versions of multiple different mitotic factors as well as creating targeted mutants that alter the functional properties of a protein. This work clearly takes advantage of other ongoing studies from these labs (including mutants and cell lines that appear to also have been described elsewhere), but the ability to combine these in a single paper and clearly describe the method provides a helpful advance and validation. Based on the description and data presented, I think that things are clear and carefully validated. As such, I do not have technical comments or concerns and I would be comfortable with this paper appearing in an appropriate journal in its present form.

      Reviewer #2 (Significance (Required)):

      This is a solid methods paper, but for considering the nature of the impact and significance of this paper, there are several things to note:

      1. The BAC-based method does appear to be a powerful and effective strategy. However, beyond the work of Mitocheck and the authors that are part of this paper, this has not seen widespread adoption. It is possible that this current method may increase its usage due to the value of the targeted mutations within the coding sequence, but at present it is not a broadly used strategy.

      We agree that using BACs as transgenes has not seen widespread adoption as a tool in the broader cell biology community (although its use certainly extends beyond members of the Mitocheck consortium). This is likely because many erroneously think that it is a technique for specialist laboratories. We are trying to change this! For reasons outlined below, there is still an increasing desire for conditional analysis of mutated genes under physiological expression/regulation, which is frequently not attainable via directed Cas9-based mutation. A major aim of this paper is thus to further simplify the methods for generating modified BAC transgenes.

      2. This BAC-based approach (and also RNAi) is increasingly being replaced by the use of CRISPR/Cas9 genome editing. The absence of Cas9-based strategies in this paper limits the potential impact and reach of this paper. The authors do mention the possibility of using a similar synthetic intron strategy for use with Cas9 in the Discussion, and appear to have conducted some experiments. If possible, it would substantially increase the value of this paper if this data and strategy were also included in the Results section (acknowledging that this may still be a work in progress).

      While some uses of BAC transgenes (e.g. GFP tagging) are better served by CRISPR/Cas9 techniques, there are several situations where using BACs is preferable. As stated in the text, RNAi-resistant BACs allow for conditional analysis of recessive mutations. Lethal mutations in essential genes will prevent growth and recovery of viable cells if integrated into the genome via Cas9. Additionally, cells carrying deleterious mutations are prone to accumulate suppressive changes in chromosome integrity or gene expression during the procedure of selecting and expanding Cas9-modified cells for analysis, particularly in the genomically unstable cancer cell lines frequently employed.

      We use both BACs and CRISPR/Cas9 in our lab according to our needs.

      We do have an ongoing project to apply this intronization technique to enable more efficient selection of CRISPR/Cas9 integrations. Preliminary results suggest that it works to allow selection of point mutations, but it is still being optimized, including a redesign of the cassette, and is not ready for publication.

      3. The method is solid and well-validated, but there are no new results or insights presented in this paper from the work that is described (this is fine, just commenting for considering the right journal fit).

      As “biological insights” gained as a result of this technique, we had cited a couple of studies that already made use of it (to functionally analyze a microcephaly-associated mutation in the centriolar protein CPAP at the single-cell level in HeLa cells and neural progenitor cells (Zheng et al. 2014, Gabriel et al. 2016)). In response to the request to include “new biology” in this paper, we will add new unpublished data investigating a specific question: Is the cell-cycle-regulated disruption of the EB1-GTSE1 (microtubule plus-end tracking proteins) interaction in mitosis required for chromosome segregation fidelity? We have generated a GTSE1 mutant with 14 phosphosites mutated to alanine using this technique. We will present the effect on chromosome segregation.

      REFEREES CROSS COMMENTING It appears that both reviewers are largely on the same page regarding this paper.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This is a methods-focused paper that presents a strategy to efficiently introduce mutations into a bacterial artificial transgene using synthetic introns. BAC-based methods have been an effective strategy for introducing transgenes into human cells to achieve near-endogenous expression, including extensive work from these authors. However, generating mutations and changes within the internal coding sequence presents some challenges for how to target these mutations and select for the mutated form. Here, the authors describe a way to overcome this by introducing synthetic introns into an adjacent sequence. This allows them to introduce a selectable marker and conduct the molecular biology without creating complications downstream for the functionality of the protein.

      This method is carefully described and presented. The authors also provide clear validation by using this to create RNAi-resistant versions of multiple different mitotic factors as well as creating targeted mutants that alter the functional properties of a protein. This work clearly takes advantage of other ongoing studies from these labs (including mutants and cell lines that appear to also have been described elsewhere), but the ability to combine these in a single paper and clearly describe the method provides a helpful advance and validation.

      Based on the description and data presented, I think that things are clear and carefully validated. As such, I do not have technical comments or concerns and I would be comfortable with this paper appearing in an appropriate journal in its present form.

      Significance

      This is a solid methods paper, but for considering the nature of the impact and significance of this paper, there are several things to note:

      1. The BAC-based method does appear to be a powerful and effective strategy. However, beyond the work of Mitocheck and the authors that are part of this paper, this has not seen widespread adoption. It is possible that this current method may increase its usage due to the value of the targeted mutations within the coding sequence, but at present it is not a broadly used strategy.

      2. This BAC-based approach (and also RNAi) is increasingly being replaced by the use of CRISPR/Cas9 genome editing. The absence of Cas9-based strategies in this paper limits the potential impact and reach of this paper. The authors do mention the possibility of using a similar synthetic intron strategy for use with Cas9 in the Discussion, and appear to have conducted some experiments. If possible, it would substantially increase the value of this paper if this data and strategy were also included in the Results section (acknowledging that this may still be a work in progress).

      3. The method is solid and well-validated, but there are no new results or insights presented in this paper from the work that is described (this is fine, just commenting for considering the right journal fit).

      REFEREES CROSS COMMENTING

      It appears that both reviewers are largely on the same page regarding this paper.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The authors here describe a method to modify bacterial artificial chromosomes (BAC) harbouring gene loci from eukaryotes. When wanting to modify a BAC an antibiotic selection cassette is often included alongside the desired mutation/modification to increase the number of successful recombinants in E. coli. Traditionally, this is removed in a second recombination process to leave only the desired modification. The novelty in the procedure described herein is to add a synthetic intron consensus sequence around the selection cassette, which eliminates the need for the subsequent removal of the antibiotic cassette from the BAC before transfection into mammalian cells, saving time and resources. The technique is clever in its simplicity and appears to function for a number of gene loci. The authors validated the correct functioning of the modified BACs for a number of genes using three main assays - transcript level, protein level and localisation.

      Major comments:

      Are the key conclusions convincing?

      The conclusion that the method described generates functional modified BACs is valid.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      While the method is successfully employed in this study, its efficiency is not quantified in relation to the state-of-the-art as described in the introduction. One assumes it would be more efficient, but this has not been tested empirically in the paper. Does the inclusion of the synthetic intron sequence have an effect on the efficiency of modifying BACs compared to a more typical two-step positive/negative antibiotic selection cassette?

      The functionality of this approach rests entirely on the ability of the target cell to correctly splice out the synthetic intron. The authors are aware of this potential problem as highlighted in the lines below, but do not make efforts to explicitly test splicing. On lines 224-225, the authors state "We cannot exclude that a small portion of synthetic introns within individual cells are misspliced". On lines 230-231 it is stated that "mis-spliced mRNAs are probably minimal and degraded by nonsense-mediated decay". On lines 215-217, the authors describe an "investigation of transgenic lines at the single-cell level" that suggests "the synthetic intron is correctly spliced out in all the cells of the population". How do the authors reach this conclusion?

      U2OS and HeLa cells are considered very "robust" and may not show detectable consequences when stressed with an increased level of nonsense-mediated decay. Further, many genes maintain a high level of expression that buffers them against small changes in transcription/splicing. The synthetic intron might have a bigger impact on more tightly regulated genes, so assessing the splicing rate would be essential if the authors wish to advocate their technique as generally applicable.

      The ability of the synthetic intron to be removed from final transcripts depends on functioning splicing machinery. The authors might emphasise this issue, as spliceosome mutations are important fields of study and might not be compatible with this method.

      The authors used un-directed integration of each BAC under study. Therefore, it is hard to assess what effect the synthetic intron has, as the authors only ever assess the downstream levels of the correctly spliced, translated and localised protein. The authors themselves state that this can lead to clonal variations in expression of up to 2-fold and on line 250 that this variation "could compensate for synthetic intron effects", but make no effort to test this. Again, lines 267-268 highlight the potential adverse effects of the synthetic introns, but the authors do not test these.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      If not already performed, a large number of bacterial colonies should be screened for the correct modification and frequency of correct ones reported. This frequency - reported for at least three different modifications - would estimate what sort of efficiency this method provides. The modified region of each BAC should be sequenced and the results reported. The rate of exactly modified clones is important, in case of spontaneous or low fidelity integration of the antibiotic cassette. The percentage of transcripts that have the synthetic intron correctly spliced out should be measured for some of the BAC constructs used in the study. A direct head-to-head comparison of this newer method compared to other techniques, or even the authors' own previous two-step approach is necessary to assess the benefits of this method. Preferably, the experiment would be run in parallel with and without antibiotic selection applied, to show that it drastically improves chances of finding a correct clone.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Repeating the transformation of the BAC and targeting cassette and assessing the recombination efficiency and sequencing should only require existing reagents and take less than a week or two to complete. Quantitative RT-PCR to assess the percentage of transcripts that have the synthetic intron spliced out would take a little more work. However, this should not be a considerable investment in time or resources for a standard microbiology laboratory and could be completed within a few weeks using modern techniques, such as that described in Londoño et al. 2016. Repeating all the experiments in parallel would be considerable work and would only be strictly necessary if the authors wish to emphasise the benefits of their method over the many others already in wide use.

      Are the data and the methods presented in such a way that they can be reproduced?

      Barring the omission of Table S1, which presumably includes exact information on the BACs modified and sequences used etc., there is sufficient other data and methods to allow the experiments to be repeated. Targeting the ESI procedure to the middle of exons is likely to have a bigger impact for smaller exons as the authors mention on lines 99-100. Making it clear which exon sizes for each gene were successfully targeted in this study would help give some idea of how significant a problem this might be. Perhaps Table S1 contains this information, but it was not provided. It would also help reviewers check the design strategies.

      Are the experiments adequately replicated and statistical analysis adequate?

      The replication and statistical analysis of the data as presented appear adequate. Figure legends should state the statistic used to generate error bars.

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are the promoters used in the vectors described universally functional? For example, is the PGK promoter functional in yeast?

      Are prior studies referenced appropriately?

      The manuscript may benefit from the referencing of BAC modification techniques from a wider variety of groups, such as those using CRISPR-guided recombineering (Pyne et al. 2015).

      Are the text and figures clear and accurate?

      The body text is very clear save minor typographical or grammatical errors. Regarding figures, some of the coloured text in Figure 1 is somewhat illegible when printed in grayscale.

      Line 278 - The acronyms LAP and NLAP are not defined/explained.

      Antibody section starting Line 282 may fit better next to Western Blot section.

      Figure 2C - The blot images would benefit from arrows to indicate expected sizes of proteins.

      Figure 3A - the graph may benefit from a dashed line at 100% to highlight that values are normalised to controls.

      Figure 4 - The differences between panels B & C are unclear.

      Figure 4E - The legend could provide a little more detail on cell cycle stage/status of the captured cells.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Lines 23-27 are somewhat unclear and feel out of context. Perhaps the authors could clarify this as a further advantage of using BACs instead of endogenous gene modifications.

      While not affecting the factual content of the paper, I would advocate that the authors format the method described in Figure S3 into a more detailed text-based layout similar to that seen in a typical Nature Methods article. However, this may depend on the format required by any eventual publishing journal. That all of the work in the paper was carried out in human cell lines and using human genes is a further caveat, but the authors admit this in the discussion and one would assume that most mammalian cells would respond similarly in their ability to splice out the synthetic intron.

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This work is a formal description of a newer method that could be useful for many of those employing bacterial artificial chromosomes in numerous studies, such as gene regulation.

      Place the work in the context of the existing literature (provide references, where appropriate).

      This work builds on methodology previously published by the authors - a counter-selection two-step procedure (Bird et al. 2011). It sets out to formally describe a method merely mentioned as "BAC intronization" in a later paper by some of the authors (Zheng et al. 2014). Other alternative one-step procedures are also available, but present a different set of challenges (Lyozin et al. 2014). Some newer approaches, such as those using CRISPR-guided recombineering (Pyne et al. 2015) or systems that combine CRISPR and positive/negative selection cassettes (Wang et al. 2016) may be slightly more efficient, but are also more complex in their design.

      Bird et al. 2011 DOI: 10/dv776q

      Pyne et al. 2015 DOI: 10/f7jx92

      Wang et al. 2016 DOI: 10/f89db5

      Zheng et al. 2014 DOI: 10/f5pkr6

      State what audience might be interested in and influenced by the reported findings.

      As a technology paper this work should have interest from a broad field of research. While the use of BACs could sometimes be considered more traditional in light of the explosion in CRISPR-based genome editing capabilities, it is definitely seeing a resurgence as the limitations of CRISPR in modifying large regions of genome become more apparent. Therefore, technologies that accelerate the modification of BACs could prove increasingly useful. As category of audience, all those involved in significant recombineering or gene/genome engineering would potentially benefit.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Synthetic genomics, synthetic biology, cancer cell biology, gene and genome engineering

  5. Jun 2020
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all reviewers for their comments and suggestions, which will make our manuscript a much better one. Accordingly, we have already made changes to the manuscript (marked in yellow) and we will perform all the experiments requested. Below, we answer the reviewers point by point.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study provides solid evidence showing a role for the spectraplakin Short-stop (Shot) in subcellular lumen formation in the Drosophila embryonic and larval trachea. This subcellular morphogenetic process relies on an inward membrane growth that depends on the proper organization of actin and microtubules (MTs) in terminal cells (TCs). Shot depletion leads to a defective or absent lumen while, conversely, Shot overexpression promotes excessive branching, independently of the regulation of centrosome numbers previously shown to be important for the regulation of the lumen formation process (Ricolo, D., Deligiannaki, M., Casanova, J. & Araújo, S. J. Centrosome Amplification Increases Single-Cell Branching in Post-mitotic Cells. Current Biology 26, 2805-2813 (2016)). Shot is rather important to regulate the organization of the cytoskeleton by crosslinking MTs and actin. Shot expression in TCs is controlled by the Drosophila Serum Response Factor (DSRF) transcription factor. Finally, Shot functionally overlaps with the MT-stabilizing protein Tau to promote lumen morphogenesis. The figures are clear and the questions well addressed with carefully designed and controlled experiments. However, I would have a few suggestions that will hopefully make some points clearer.

      **Major comments:**

      -Statistical analyses should be added for comparisons of proportions, including Fig. 1E, 1L, Fig. 2G-I, Fig. 6L, Fig. 7K, Fig. 8C-D and Fig. 9G.

      We agree with this and have now redone all graphs and revised all quantifications from this study. We have added error bars in all above mentioned graphs and have provided statistical analysis where appropriate. We have also redone all graphics and phenotype reporting, which is done now in relation to total TCs (rather than embryos or GBs and DBs TCs). This was suggested also by reviewer #2 and we agree because this is a more stringent and comparable way of quantifying our results.

      -It is not always clear what genotype has been used as the "wt" genotype, as in Fig. S2 or Fig. 3 for example, this should be added to figure legends.

      We have now clarified which flies are used as controls in each experiment throughout the paper. We have left wt where flies were wt, and changed all other cases to either the genotype or “control”.

      -Live imaging of Shot has been performed with ShotC-GFP, that cannot bind actin. Don't the authors think ShotA-GFP would reflect more accurately Shot endogenous behavior as it interacts both with actin and MTs? It would be better to show this, even if the results shown here tend to be consistent with Shot endogenous localization shown with Shot antibody staining.

      We agree and we will analyse movies with both ShotC and ShotA and present them in the revised version.

      -It is of course not possible to generate CRISPR mutant flies with mutations in putative DSRF binding sites in a reasonable amount of time, to confirm that Shot transcription is controlled by DSRF. It would thus be nice to reveal shot mRNA expression with in situ hybridization experiments in wt vs. bs embryos. This would confirm that Shot mRNA is downregulated upon DSRF inhibition and rule out a possible indirect effect on Shot protein stability for example.

      We believe the presented 3-way approach (in silico, protein quantification and phenotype rescue) is sufficient to show that Shot expression is regulated by DSRF. It is unlikely that we are dealing with protein stability or other issues, because we can rescue the lumen elongation phenotype by solely expressing Shot in TCs. However, we agree it would be nice to show this in an in situ hybridization experiment, and we will try to provide a conclusive one for resubmission. In situ detection methods, however, may not be accurate enough to detect such differences in single-cells.

      -In the same figure, it would also be interesting to show what happens to actin and MTs in bs TCs and to which extent their organization is rescued by Shot overexpression.

      We are working on this for resubmission. These experiments were frozen by the current COVID-19 pandemic and this is why they were not submitted with the first version.

      -UAS-EB1GFP does not seem to be an appropriate control in Figure 9 (A and B) since it can affect MT dynamics (Vitre, B. et al. EB1 regulates microtubule dynamics and tubulin sheet closure in vitro. Nat. Cell Biol. 10, 415-421 (2008)). Why not simply use an UAS-GFP?

      We have not detected any noticeable larval TC phenotypes upon overexpressing UAS-EB1GFP in TCs. Their branching is comparable to that in previous studies (for example Schotenfeld-Roames, et al Current Biology 2014) and there were no detectable luminal branching phenotypes. However, we agree it is more correct to analyse cells with a plain GFP and have repeated the controls for this experiment using DSRFGAL4UASGFP. This is now shown in figure 9.

      -Shot and probably Tau crosslinking activities are important for lumen morphogenesis with a striking increase in the number of embryos without lumen in shot3 and shot3 tauMR22 mutant embryos. The rescue experiments clearly show that Shot binding to both MT and actin is essential for efficient rescue. The same might apply to Tau since it is able to crosslink actin and MTs (Elie, A. et al. Tau co-organizes dynamic microtubule and actin networks. Sci Rep 5, 1-10 (2015)). I believe showing actin and MTs organization in these rescue experiments would be necessary.

      We agree and we will provide these experiments upon resubmission.

      Second, the overexpression experiments indicate that Shot is able to induce extra lumen formation even when unable to bind actin, as shown by the increase in the number of supernumerary lumina (ESLs) under overexpression of ShotC and, to a lesser extent, ShotCtail. This phenotype is also observed under Tau overexpression. This suggests that making MTs more stable, rather than crosslinking them to actin, could be sufficient to promote extra lumen formation in a wt context. Stabilising MTs by treatment with Taxol might thus be sufficient to promote ESL formation. I am fully aware of the difficulty of treating Drosophila embryos with drugs, making this experiment hard to do, but I think this dual function of Shot and Tau (crosslinking actin and MTs to promote branching vs. stabilizing MTs leading to excessive branching) should be discussed.

      In Figure 2 we show not only that UAS-ShotC is able to induce ESLs but also that UAS-ShotCtail, which contains only the MT-binding domain of Shot, is enough to induce ESLs in TCs, whereas UAS-deltaCtail is not. We agree Taxol treatment would be a nice experiment to do; however, we also think we provide enough evidence that MT stability is enough for ESL formation whereas de novo lumen formation requires crosslinking of MTs to actin. As advised, we will better discuss the dual function of both Shot and Tau in ESL generation and de novo lumen formation for resubmission.

      **Minor comments:**

      We have already addressed most of these minor comments in the manuscript (text revised and changes in yellow), and we provide answers to some of the comments below.

      -p2 line 1: 'acentrosomal luminal branching points' may be better than 'acentrosomal branching points' to describe the phenotype.

      -p4, line 16: the reference 23 is not properly inserted (should be after 'closure').

      -p5, line 16: Please mention what the abbreviations Bnl and Btn stand for.

      -p5, line 20: these 80% of TCs cells with defects in subcellular lumen formation should appear on the graph in Fig. 1E (as shown in graph 1L).

      We have added shot RNAi results to graph E in figure 1.

      -p5, line 26: this 36% value does not seem to correspond to anything on the graph in Fig. 1N. According to the figure legend, 20% of TCs did not elongate at all and the lumen was completely absent (class IV), which is consistent with the result shown in Fig. 1L. Also, I am not sure why only 25 TCs were analysed in Fig. 1N while there are the data to analyse more as shown in Fig. 1E (400 TCs), this would make the graph more representative.

      Figure 1N represents a detail of the different phenotypes present in shot mutant embryos. Whereas for most of the paper we consider only complete lack of TC lumen, here we show the different types of affected TCs and not just the ones with a complete lack of subcellular lumen. We apologise because it was not explained in the original manuscript that types III and IV are the “no lumen” class (they were subdivided into 2 classes because they have different cell elongation phenotypes). 36% of the total of affected TCs displayed the lack of lumen phenotype (this corresponds to 22.5% of the total number of TCs, because affected TCs make up only 62.5% of the total). Numbers are similar but not exactly the same because this analysis was done using confocal microscopy, with cells analysed one by one in detail, which is not possible using colorimetric methods and only luminal markers. This is also the reason we only analysed 25 TCs in this case. We thank the reviewer for pointing this out and have better described it in the manuscript.
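
      The relation between the two percentages quoted in this reply can be checked with a short calculation. This is only an illustrative sketch: the variable names are hypothetical and not taken from the manuscript, and the two inputs are simply the figures stated in the reply (62.5% of all TCs affected, 36% of affected TCs lacking a lumen).

```python
# Figures quoted in the reply (inputs taken from the text, names are hypothetical)
affected_fraction = 0.625        # share of all TCs showing any defect
no_lumen_among_affected = 0.36   # share of affected TCs in the "no lumen" classes (III + IV)

# Share of all TCs that lack a lumen = P(affected) * P(no lumen | affected)
no_lumen_of_total = affected_fraction * no_lumen_among_affected

print(f"{no_lumen_of_total:.1%}")  # 22.5%
```

      The product reproduces the 22.5% quoted in the reply, which is the value to compare with the roughly 20% "no lumen" frequency discussed for Fig. 1L.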

      -p6, line 8: ShotA-GFP is indeed a long isoform but is not the full-length Shot, as it does not contain the plakin repeat exon which would add another ~3000aa.

      We have corrected this.

      -p6, lines 21-23: ShotA-GFP localisation is not shown in FigS1. The authors should refer to Fig. 2. Enlarged areas/arrows might help the reader to better visualise the different localisations of ShotA-GFP and ShotC-GFP.

      We thank the reviewer for this request and we will change the figure, providing enlarged areas, upon resubmission. In this version of the manuscript we have already corrected the erroneous figure reference in the text.

      -p7, line 23: Rca1 mutants should be better introduced here.

      We have added one sentence of introduction to the Rca1 phenotype.

      -p8, line 6: Shot colocalizes/associates with stable MTs and actin would be a more appropriate title for this paragraph.

      We thank the reviewer for this alternative, and we have changed this title in the manuscript.

      -p16, line 18: 'Shot is able to mediate crosstalk' would be better than 'Shot is able to crosstalk'.

      -p40, lines 6 and 7: L, M and N should be K', K' and K' respectively.

      -p41, Fig 10D: It is quite hard to see on the cartoon what the phenotype is for Shot OE.

      We will make this clearer for resubmission.

      -The following reference shows an important role for Shot in crosslinking actin and MTs during morphogenesis of the Drosophila embryo and should be cited in this manuscript (Booth, A. J. R., Blanchard, G. B., Adams, R. J. & Röper, K. A Dynamic Microtubule Cytoskeleton Directs Medial Actomyosin Function during Tube Formation. Developmental Cell 29, 562-576 (2014)).

      We thank the reviewer for pointing this out, because this is of course an important reference known to us, which we forgot to add. We have now added this to the manuscript.

      -FigS3. It would be good to add the labels on the figure (ShotC-GFP in green, and MoeRFP/lifeActinRFP in Magenta).

      We will do this for resubmission.

      Reviewer #1 (Significance (Required)): The findings shown in this manuscript shed an important light on the way subcellular morphogenesis occurs. It was known that both actin and MTs were required in this process, particularly during the formation of Drosophila trachea (JayaNandanan, N., Mathew, R. & Leptin, M. Guidance of subcellular tubulogenesis by actin under the control of a synaptotagmin-like protein and Moesin. Nature Communications 1-10 (2019). doi:10.1038/ncomms4036; Gervais, L. & Casanova, J. In Vivo Coupling of Cell Elongation and Lumen Formation in a Single Cell. Current Biology 20, 359-366 (2010)). This work provides additional molecular insights into the way branching morphogenesis from a single cell occurs in vivo, clearly demonstrating a requirement for actin-MT crosslinking mediated by Shot and Tau. This could be of great interest in the field of branching morphogenesis and lumen formation, not only in invertebrates but also in vertebrates where such a crosslinking might occur in the vasculature, the lung, the kidney or the mammary gland for example (Ochoa-Espinosa, A. & Affolter, M. Branching Morphogenesis: From Cells to Organs and Back. Cold Spring Harb Perspect Biol 4, a008243-a008243 (2012)).

      *Field of expertise:* morphogenesis, Drosophila, cytoskeleton, microtubules.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:** The development of branched structures with intracellular lumen is widely observed in single cells of circulatory systems. However the molecular and cellular mechanisms of this complex morphogenesis are largely unknown. In previous study, the authors revealed that centrosome as a microtubule organizing center (MTOC) located at the apical junction contributes subcellular lumen formation in the terminal cells of Drosophila tracheal system. The microtubule bundles organized by MTOC are suggested to serve as trafficking mediators and structural stabilizers for the newly elongated lumen.

      In this manuscript, they focused on a Drosophila spectraplakin, Shot, which have been reported to crosslink MT minus-ends to actin network, in the subcellular lumen formation. The paper started by description of lumen elongation defect of the tracheal terminal cells in the shot[3] null mutant. The overexpression of full-length and series of truncated form of shot exhibited extra-subcellular lumina (ESL) in TCs, suggesting that Shot is required for the lumen formation in dose dependent manner. They next addressed whether Shot overexpression induces ESL through the supernumerary centrosomes as in Rca1 mutant, however the number of centrosomes was not affected. Moreover, the ESL were sprouted distally from the apical junction, suggesting that Shot operate in different way from the Rca1-dependent microtubule organization. To get mechanistic insight of Shot in the luminal formation, they checked localization of the Shot and found it localized with stable MTs around the nascent lumen and with the F-actin at the tip of the cell during the cell elongation and subcellular lumen formation. In shot[3] mutant, the MT-bundles were no longer localized to apical region and the actin accumulation at the tip of the cell was also reduced. The rescue experiments using several truncated forms of Shot, and well-designed genetic analysis using various shot mutants revealed that both MT binding domain and actin binding domains are needed to develop the lumen. The expression of shot was under the regulation by terminal cell-specific transcription factor bs/DSRF, and the overexpression of shot in bs LOF mutant suppressed its phenotype, indicated that part of the luminal phenotype of bs mutant in terminal cells are due to lower levels of the activity of shot. Finally, they checked whether Tau can compensate the function of shot in the subcellular lumen formation. The lumen elongation defect in shot mutant was suppressed by tau expression, and tau overexpression phenocopied the shot overexpression-induced ESL. Although tau mutant did not show the lumen formation defects, the double mutant of shot and tau exhibited synergistic effect. Shot was also required for subcellular luminal branching at larval stages.

      Overall, this work highlighted the importance of Shot as a crosslinker between MT and actin that acts in downstream of the FGF signaling-induced bs/DSRF expression for the subcellular lumen formation. An excess of Shot is sufficient for ESL formation from ectopic acentrosomal branching points. Furthermore, the Tau protein can functionally replace Shot in this context.

      **Major comments:**

      *- Are the key conclusions convincing?* *- Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?*

      The conclusions were basically supported by the set of data presented in this article, but following points need to be clarified.

      The truncated form ShotC lacks only half of calponin domain that are essential for the actin binding, thus it is still possible to bind actin to some extent. Although the actin binding activity is reported as "very weak" in the cited references, the quantitative analysis has not been done. Thus, the interpretation and claims based on the experiments using ShotC should be reviewed carefully.

      We agree with the reviewer and will revise the text throughout for resubmission to make this unambiguous. However, we would like to point out that our claims are based not only on UAS-ShotC but also on the shotkakP2 allele, which lacks one of the calponin domains, and on isoforms such as UAS-Shot C-tail, which do not have any ABD.

      Data set in some places seems fragmented. For example, overexpression study of shot constructs (Fig. 2) lacks phenotypic comparison of control (btl Gal4 driven control FP) to compare if phenotypes of shot constructs expression are different from control. Different methods of phenotypic quantification are employed. One was counting embryo number with at least one abnormality among 20 TCs of DB or GB, or the other counting every TC for the presence of lumen/branching conditions. The latter is more stringent measure and is more appropriate for the study of single cell morphogenesis.

      We totally agree with the reviewer. We have now revised all quantifications and graphs:

      1) We have used btl>GFP as the control for all overexpression experiments in embryos and DSRFGAL4UASGFP as the control in larvae.

      2) We have made the paper uniform regarding quantifications, which are now all done in relation to total TCs and not embryos.

      For this reason, many of the graphs, figure legends and quantification values in the manuscript text have now changed.

      *- Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.* The all movies were using ShotC isoform which lacks half of the actin binding domain. The truncated isoform is not suitable to observe the localization, especially the colocalization with actin. The movies need to be retaken using full-length Shot at the dosage that does not interfere with normal TC development.

      We agree and we will analyse movies with both ShotC and ShotA for resubmission.

      Some statements on Moesin and Tau localization sound as if the authors studied Shot interaction with nascent Moe and Tau molecules. This is confusing because fragments of Moe and Tau, but not functional full length proteins, were used.

      We will revise the text to make this unambiguous for resubmission.

      *- Are the suggested experiments realistic in terms of time and resources?* It would help if you could add an estimated cost and time investment for substantial experiments.

      Because the transgenic fly is already present, we assume it would be done in 4 weeks. However, it would be influenced under social circumstances whether the lab facilities are able to access or not.

      *- Are the data and the methods presented in such a way that they can be reproduced?* *- Are the experiments adequately replicated and statistical analysis adequate?*

      The methods provided seem to be sufficient for reproducing the data by competent researchers, and most of the data are solid and the sample numbers are sufficient for the claims. However, the criteria for phenotypic evaluation differs among graphs and figures, that possibly confuse the readers. Standardized measurement methods are desirable.

      **Minor comments:**

      *- Specific experimental issues that are easily addressable.*

      In the rescue experiments shown in Figure 6, only full-length Shot rescued the subcellular lumen formation, but either of truncated Shot did not. The localization study of MT and actin in those conditions will reveal whether proper localizations of actin and MT are critical for the lumen formation.

      We are working on this for resubmission. These experiments were stalled by the current COVID-19 pandemic and this is why they were not submitted with the first version. We will provide MT and actin localization for the rescue experiments with ShotA and ShotC.

      *- Are prior studies referenced appropriately?* The references are cited appropriately. *- Are the text and figures clear and accurate?* There are several typos: Remodelling -> remodeling, signalling -> signaling. In the figure 2, G and H seem redundant. Scale bars are missing in Fig1 F-K, Fig2 K-L, Fig6 A-I, Fig7 E-J and Fig8 E-J.

      We have changed the graphs in figure 2. Typos have been corrected. We will provide scale bars for resubmission.

      The author often called shot+ genotype as "wild type". They are transgenic strains with some mutations, and cannot be found in the wild. They should be simply called with genotype or "control" for experiments.

      We thank the reviewer for pointing out these typos and the inconsistencies in the naming of control genotypes. We have partly revised the text and figures and will finish for resubmission.

      *- Do you have suggestions that would help the authors improve the presentation of their data and conclusions?* In Figure 4, as the localization of Shot is difficult to see in detail, enlarged insets might help. In addition, the green and cyan in C'-E' is difficult to distinguish.

      We will change this for resubmission.

      With Figure 5, the authors claimed that Shot LOF leads to disorganized MT-bundles and actin localization. We feel this is an overstatement and the Figure should be backed up with better data, or removed. F-actin and microtubule localizations are highly dynamic and the snapshot pictures are insufficient for demonstrating defective localization. It is also possible that (potential) difference in the marker localization is due to indirect effect of Shot LOF in cell shape.

      We agree with the reviewer that fixed samples are not the best way to analyse cytoskeletal components, but we observe clear differences in MT bundles, and especially in actin localization, in shot mutants as compared to controls, and we believe it is important to show these results. Cell shape might of course alter the analysis, which is why we present 3 different cell shapes in Figure 5. In addition, there are many previous studies in which localization of MTs and actin was analysed in fixed mutant embryos, where cell shape is also affected, and which revealed important steps in TC formation (Gervais and Casanova, 2010; JayaNandanan et al., 2014). Nonetheless, we have revised the text in order to avoid overstatements.

      Reviewer #2 (Significance (Required)):

      *- Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.* *- Place the work in the context of the existing literature (provide references, where appropriate).*

      In blood capillary and insect trachea, the branching process of single vessel cells involves sprouting of cell protrusions, followed by the lumen extension from the main vessels. The lumen formation involves assembly of plasma membrane components inside of the cytoplasm. Since the luminal membrane is associated with protein complexes common to apical cell membrane, lumen formation is believed to involve redirection of apical trafficking of membranes to intracellular sites (Sigurbjörnsdóttir, Mathew, Leptin 2014, 10.1038/nrm3871). The authors previously demonstrated that centrosome is an important link of preexisting lumen to de novo lumen formation, leading to the hypothesis that centrosome-derived microtubules organize lumen membrane assembly.

      *- State what audience might be interested in and influenced by the reported findings.*

      In this manuscript, the authors addressed this issue by looking at the function of Shot/Plakin that has both microtubule and actin binding activities. Shot is an ideal candidate for linking actin-rich cell protrusions in the leading edge to centrosome-associated lumen tip. Indeed the authors clearly showed that shot is required for lumen extension and overexpressed shot protein associates with intracellular tract rich in microtubules and F-actin. Their findings are definitely a progress in the field of Drosophila tracheal development. Having said that, how Shot links leading edge protrusions and centrosomes, how it is organized into pre-lumen tract, and how it contribute to further assembly of luminal membrane and directed secretion, are not well understood yet. Without clues to those fundamental questions, I believe this paper is most appropriate for experts readers of Drosophila cell biology and tracheal development.

      Finally I feel that the paper include many data sets and some pictures are not easy to grasp essential points, such as three movies showing localization of overexpressed shot-C, RFP-moesin, and Lifeact.

      *- Define your field of expertise with a few keywords to help the authors contextualize your point of view.*

      Drosophila, tracheal cell biology.

      *- Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.*

      No

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      In their manuscript entitled "Coordinated crosstalk between microtubules and actin by a spectraplakin regulates lumen formation and branching" Ricolo and Araujo characterize the requirement for Short Stop (Shot) in the formation of subcellular tubes in tracheal terminal cells.

      The authors examined embryos homozygous for shot3, a presumed null allele of shot. They found an 80% penetrant defect in seamless tube formation or growth. The phenotype resembles that reported for mutations in blistered, which encodes the Drosophila SRF ortholog. The authors find that expression of SRF is not blocked by mutations in shot and later find that bs mutants have decreased levels of shot expression and that shot overexpression can partly suppress the bs tube formation defects.

      The authors then examine whether the requirement for shot is autonomous to the trachea and find that it is, as pan-tracheal shot RNAi replicates the seamless tube defects.

      The authors find that overexpression of various Shot isoforms results in the formation of ectopic seamless tubes within terminal cells. Using the various transgenic constructs available for shot, the authors show that the overexpression phenotype is dependent upon the interaction between Shot and microtubules, and is dose-dependent.

      Previous work had shown that ectopic terminal cell tubes also can arise due to increased centrosome number; the authors show that centrosome number is not altered in shot mutants.

      Shot has well characterized actin and microtubule binding functions, and the authors show that Shot localization overlaps both with microtubules and with actin, and that both cytoskeletal elements are aberrant in shot mutant cells. In a series of experiments utilizing various shot mutant backgrounds and shot transgenes, the authors identify requirements for both Shot-cytoskeleton interactions in the formation and branching of seamless tubes in terminal cells.

      Finally, the authors examine the requirement for Tau in the same processes. Tau and Shot had previously been found to work together in neurons, and this seems to be true in terminal cells as well. Tau overexpression induces ectopic seamless tubes and can partially suppress shot loss of function. Embryos mutant for tau showed seamless tube directionality defects, but not lumen formation or branching. Embryos doubly mutant for tau and shot showed a more severe seamless tube defect than shot mutants alone - an increase in terminal cells with no lumen from 22% to 85%.

      Authors also examined terminal cells in larval stages using dsrf-Gal4 to knockdown shot in terminal cells (rather than pan-tracheal knockdown with breathless).

      The authors conclude from their studies that Shot, through its interactions with microtubules and the actin cytoskeleton coordinate the outgrowth and branching of subcellular tubes. Overlapping function of Tau and possibly other additional MAPs also act in these processes.

      The work is largely well done and the conclusions are supported by the data.

      **Minor concerns:**

      -If one were to start this work today, crispr knockout and knockins would be preferred. While shot^3 is widely considered a null allele, there are indications that some shot function is still present in shot^3 embryos. This would also be relevant to the penetrance of the defects. The transgenes are useful, but given the dosage effects noted in various of the authors experiments, interpretation of some experiments is complicated as compared to a knockin. For overexpression experiments, landing site constructs would be preferable. I do not mean to suggest that the authors necessarily go this route, but am just pointing out a limitation of the approach.

      We agree, but we also think that, given the amount of data and tools generated by other labs over recent years on shot function in the nervous system (Voelzmann et al 2017), we are in a position to draw the conclusions of this work from these transgenes and the different shot alleles.

      -Insight into function at higher resolution than altered microtubule and actin organization would significantly increase the impact.

      -cell autonomy (line 19, p5) is not the correct term. Pan-tracheal knockdown tests tissue autonomy. Mosaic analysis or terminal cell specific knockdown would address cell autonomy.

      We have changed the manuscript accordingly.

      -line 14 p6 acting should be actin

      -dsrf-Gal4 transgenes were made by Mark Metzstein

      We have corrected these.

      -there also appears to be rescue of the fusion cell defects of shot by Tau overexpression. Authors should comment on this and what it means for the seamless tubulogenesis program in terminal cells vs fusion cells.

      We will reanalyse shot embryos rescued with tau, focusing on fusion phenotypes, and discuss this in the revised version.

      Reviewer #3 (Significance (Required)): The findings will be of interest to a broad cell biology community as they provide a conceptual advance and may help to focus future work on seamless tubulogenesis. The authors do a good job of placing the results in the context of previous studies. *Field of expertise:* Drosophila, tracheal tubulogenesis, developmental biology

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      In their manuscript entitled "Coordinated crosstalk between microtubules and actin by a spectraplakin regulates lumen formation and branching" Ricolo and Araujo characterize the requirement for Short Stop (Shot) in the formation of subcellular tubes in tracheal terminal cells.

      The authors examined embryos homozygous for shot3, a presumed null allele of shot. They found an 80% penetrant defect in seamless tube formation or growth. The phenotype resembles that reported for mutations in blistered, which encodes the Drosophila SRF ortholog. The authors find that expression of SRF is not blocked by mutations in shot and later find that bs mutants have decreased levels of shot expression and that shot overexpression can partly suppress the bs tube formation defects.

      The authors then examine whether the requirement for shot is autonomous to the trachea and find that it is, as pan-tracheal shot RNAi replicates the seamless tube defects.

      The authors find that overexpression of various Shot isoforms results in the formation of ectopic seamless tubes within terminal cells. Using the various transgenic constructs available for shot, the authors show that the overexpression phenotype is dependent upon the interaction between Shot and microtubules, and is dose-dependent.

      Previous work had shown that ectopic terminal cell tubes also can arise due to increased centrosome number; the authors show that centrosome number is not altered in shot mutants.

      Shot has well characterized actin and microtubule binding functions, and the authors show that Shot localization overlaps both with microtubules and with actin, and that both cytoskeletal elements are aberrant in shot mutant cells. In a series of experiments utilizing various shot mutant backgrounds and shot transgenes, the authors identify requirements for both Shot-cytoskeleton interactions in the formation and branching of seamless tubes in terminal cells.

      Finally, the authors examine the requirement for Tau in the same processes. Tau and Shot had previously been found to work together in neurons, and this seems to be true in terminal cells as well. Tau overexpression induces ectopic seamless tubes and can partially suppress shot loss of function. Embryos mutant for tau showed seamless tube directionality defects, but not lumen formation or branching. Embryos doubly mutant for tau and shot showed a more severe seamless tube defect than shot mutants alone - an increase in terminal cells with no lumen from 22% to 85%.

      Authors also examined terminal cells in larval stages using dsrf-Gal4 to knockdown shot in terminal cells (rather than pan-tracheal knockdown with breathless).

      The authors conclude from their studies that Shot, through its interactions with microtubules and the actin cytoskeleton coordinate the outgrowth and branching of subcellular tubes. Overlapping function of Tau and possibly other additional MAPs also act in these processes.

      The work is largely well done and the conclusions are supported by the data.

      Minor concerns:

      -If one were to start this work today, crispr knockout and knockins would be preferred. While shot^3 is widely considered a null allele, there are indications that some shot function is still present in shot^3 embryos. This would also be relevant to the penetrance of the defects. The transgenes are useful, but given the dosage effects noted in various of the authors experiments, interpretation of some experiments is complicated as compared to a knockin. For overexpression experiments, landing site constructs would be preferable. I do not mean to suggest that the authors necessarily go this route, but am just pointing out a limitation of the approach.

      -Insight into function at higher resolution than altered microtubule and actin organization would significantly increase the impact.

      -cell autonomy (line 19, p5) is not the correct term. Pan-tracheal knockdown tests tissue autonomy. Mosaic analysis or terminal cell specific knockdown would address cell autonomy.

      -line 14 p6 acting should be actin

      -dsrf-Gal4 transgenes were made by Mark Metzstein

      -there also appears to be rescue of the fusion cell defects of shot by Tau overexpression. Authors should comment on this and what it means for the seamless tubulogenesis program in terminal cells vs fusion cells.

      Significance

      The findings will be of interest to a broad cell biology community as they provide a conceptual advance and may help to focus future work on seamless tubulogenesis. The authors do a good job of placing the results in the context of previous studies.

      Field of expertise: Drosophila, tracheal tubulogenesis, developmental biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The development of branched structures with intracellular lumen is widely observed in single cells of circulatory systems. However the molecular and cellular mechanisms of this complex morphogenesis are largely unknown. In previous study, the authors revealed that centrosome as a microtubule organizing center (MTOC) located at the apical junction contributes subcellular lumen formation in the terminal cells of Drosophila tracheal system. The microtubule bundles organized by MTOC are suggested to serve as trafficking mediators and structural stabilizers for the newly elongated lumen.

      In this manuscript, they focused on a Drosophila spectraplakin, Shot, which have been reported to crosslink MT minus-ends to actin network, in the subcellular lumen formation. The paper started by description of lumen elongation defect of the tracheal terminal cells in the shot[3] null mutant. The overexpression of full-length and series of truncated form of shot exhibited extra-subcellular lumina (ESL) in TCs, suggesting that Shot is required for the lumen formation in dose dependent manner. They next addressed whether Shot overexpression induces ESL through the supernumerary centrosomes as in Rca1 mutant, however the number of centrosomes was not affected. Moreover, the ESL were sprouted distally from the apical junction, suggesting that Shot operate in different way from the Rca1-dependent microtubule organization. To get mechanistic insight of Shot in the luminal formation, they checked localization of the Shot and found it localized with stable MTs around the nascent lumen and with the F-actin at the tip of the cell during the cell elongation and subcellular lumen formation. In shot[3] mutant, the MT-bundles were no longer localized to apical region and the actin accumulation at the tip of the cell was also reduced. The rescue experiments using several truncated forms of Shot, and well-designed genetic analysis using various shot mutants revealed that both MT binding domain and actin binding domains are needed to develop the lumen. The expression of shot was under the regulation by terminal cell-specific transcription factor bs/DSRF, and the overexpression of shot in bs LOF mutant suppressed its phenotype, indicated that part of the luminal phenotype of bs mutant in terminal cells are due to lower levels of the activity of shot. Finally, they checked whether Tau can compensate the function of shot in the subcellular lumen formation. 
      The lumen elongation defect in shot mutant was suppressed by tau expression, and tau overexpression phenocopied the shot overexpression-induced ESL. Although tau mutant did not show the lumen formation defects, the double mutant of shot and tau exhibited synergistic effect. Shot was also required for subcellular luminal branching at larval stages.

      Overall, this work highlighted the importance of Shot as a crosslinker between MT and actin that acts in downstream of the FGF signaling-induced bs/DSRF expression for the subcellular lumen formation. An excess of Shot is sufficient for ESL formation from ectopic acentrosomal branching points. Furthermore, the Tau protein can functionally replace Shot in this context.

      Major comments:

      - Are the key conclusions convincing?

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The conclusions were basically supported by the set of data presented in this article, but following points need to be clarified.

      The truncated form ShotC lacks only half of calponin domain that are essential for the actin binding, thus it is still possible to bind actin to some extent. Although the actin binding activity is reported as "very weak" in the cited references, the quantitative analysis has not been done. Thus, the interpretation and claims based on the experiments using ShotC should be reviewed carefully.

      Data set in some places seems fragmented. For example, overexpression study of shot constructs (Fig. 2) lacks phenotypic comparison of control (btl Gal4 driven control FP) to compare if phenotypes of shot constructs expression are different from control. Different methods of phenotypic quantification are employed. One was counting embryo number with at least one abnormality among 20 TCs of DB or GB, or the other counting every TC for the presence of lumen/branching conditions. The latter is more stringent measure and is more appropriate for the study of single cell morphogenesis.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The all movies were using ShotC isoform which lacks half of the actin binding domain. The truncated isoform is not suitable to observe the localization, especially the colocalization with actin. The movies need to be retaken using full-length Shot at the dosage that does not interfere with normal TC development.

      Some statements on Moesin and Tau localization sound as if the authors studied Shot interaction with nascent Moe and Tau molecules. This is confusing because fragments of Moe and Tau, but not functional full length proteins, were used.

      - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Because the transgenic fly is already present, we assume it would be done in 4 weeks. However, it would be influenced under social circumstances whether the lab facilities are able to access or not.

      - Are the data and the methods presented in such a way that they can be reproduced?

      - Are the experiments adequately replicated and statistical analysis adequate?

      The methods provided seem to be sufficient for reproducing the data by competent researchers, and most of the data are solid and the sample numbers are sufficient for the claims. However, the criteria for phenotypic evaluation differs among graphs and figures, that possibly confuse the readers. Standardized measurement methods are desirable.

      Minor comments:

      - Specific experimental issues that are easily addressable.

      In the rescue experiments shown in Figure 6, only full-length Shot rescued the subcellular lumen formation, but either of truncated Shot did not. The localization study of MT and actin in those conditions will reveal whether proper localizations of actin and MT are critical for the lumen formation.

      - Are prior studies referenced appropriately?

      The references are cited appropriately.

      - Are the text and figures clear and accurate?

      There are several typos: Remodelling -> remodeling, signalling -> signaling. In the figure 2, G and H seem redundant. Scale bars are missing in Fig1 F-K, Fig2 K-L, Fig6 A-I, Fig7 E-J and Fig8 E-J.

      The author often called shot+ genotype as "wild type". They are transgenic strains with some mutations, and cannot be found in the wild. They should be simply called with genotype or "control" for experiments.

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      In Figure 4, as the localization of Shot is difficult to see in detail, enlarged insets might help. In addition, the green and cyan in C'-E' is difficult to distinguish.

      With Figure 5, the authors claimed that Shot LOF leads to disorganized MT-bundles and actin localization. We feel this is an overstatement and the Figure should be backed up with better data, or removed. F-actin and microtubule localizations are highly dynamic and the snapshot pictures are insufficient for demonstrating defective localization. It is also possible that (potential) difference in the marker localization is due to indirect effect of Shot LOF in cell shape.

      Significance

      - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      - Place the work in the context of the existing literature (provide references, where appropriate).

      In blood capillary and insect trachea, the branching process of single vessel cells involves sprouting of cell protrusions, followed by the lumen extension from the main vessels. The lumen formation involves assembly of plasma membrane components inside of the cytoplasm. Since the luminal membrane is associated with protein complexes common to apical cell membrane, lumen formation is believed to involve redirection of apical trafficking of membranes to intracellular sites (Sigurbjörnsdóttir, Mathew, Leptin 2014, 10.1038/nrm3871). The authors previously demonstrated that centrosome is an important link of preexisting lumen to de novo lumen formation, leading to the hypothesis that centrosome-derived microtubules organize lumen membrane assembly.

      - State what audience might be interested in and influenced by the reported findings.

      In this manuscript, the authors addressed this issue by looking at the function of Shot/Plakin that has both microtubule and actin binding activities. Shot is an ideal candidate for linking actin-rich cell protrusions in the leading edge to centrosome-associated lumen tip. Indeed the authors clearly showed that shot is required for lumen extension and overexpressed shot protein associates with intracellular tract rich in microtubules and F-actin. Their findings are definitely a progress in the field of Drosophila tracheal development. Having said that, how Shot links leading edge protrusions and centrosomes, how it is organized into pre-lumen tract, and how it contribute to further assembly of luminal membrane and directed secretion, are not well understood yet. Without clues to those fundamental questions, I believe this paper is most appropriate for experts readers of Drosophila cell biology and tracheal development.

      Finally I feel that the paper include many data sets and some pictures are not easy to grasp essential points, such as three movies showing localization of overexpressed shot-C, RFP-moesin, and Lifeact.

      - Define your field of expertise with a few keywords to help the authors contextualize your point of view.

      Drosophila, tracheal cell biology.

      - Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      No

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study provides solid evidence for a role of the spectraplakin Short-stop (Shot) in subcellular lumen formation in the Drosophila embryonic and larval trachea. This subcellular morphogenetic process relies on an inward membrane growth that depends on the proper organization of actin and microtubules (MTs) in terminal cells (TCs). Shot depletion leads to a defective or absent lumen while, conversely, Shot overexpression promotes excessive branching, independently of the regulation of centrosome numbers previously shown to be important for the regulation of the lumen formation process (Ricolo, D., Deligiannaki, M., Casanova, J. & Araújo, S. J. Centrosome Amplification Increases Single-Cell Branching in Post-mitotic Cells. Current Biology 26, 2805-2813 (2016)). Shot is instead important for regulating the organization of the cytoskeleton by crosslinking MTs and actin. Shot expression in TCs is controlled by the Drosophila Serum Response Factor (DSRF) transcription factor. Finally, Shot functionally overlaps with the MT-stabilizing protein Tau to promote lumen morphogenesis.

      The figures are clear and the questions well addressed with carefully designed and controlled experiments. However, I have a few suggestions that will hopefully make some points clearer.

      Major comments:

      -Statistical analyses should be added for comparisons of proportions, including Fig. 1E, 1L, Fig. 2G-I, Fig. 6L, Fig. 7K, Fig. 8C-D and Fig. 9G.

      -It is not always clear what genotype has been used as the "wt" genotype, as in Fig. S2 or Fig. 3 for example; this should be added to the figure legends.

      -Live imaging of Shot has been performed with ShotC-GFP, that cannot bind actin. Don't the authors think ShotA-GFP would reflect more accurately Shot endogenous behavior as it interacts both with actin and MTs? It would be better to show this, even if the results shown here tend to be consistent with Shot endogenous localization shown with Shot antibody staining.

      -It is of course not possible to generate CRISPR mutant flies with mutations in putative DSRF binding sites in a reasonable amount of time, to confirm that Shot transcription is controlled by DSRF. It would thus be nice to reveal shot mRNA expression with in situ hybridization experiments in wt vs. bs embryos. This would confirm that Shot mRNA is downregulated upon DSRF inhibition and rule out a possible indirect effect on Shot protein stability for example.

      -In the same figure, it would also be interesting to show what happens to actin and MTs in bs TCs and to which extent their organization is rescued by Shot overexpression.

      -UAS-EB1GFP does not seem to be an appropriate control in Figure 9 (A and B) since it can affect MT dynamics (Vitre, B. et al. EB1 regulates microtubule dynamics and tubulin sheet closure in vitro. Nat. Cell Biol. 10, 415-421 (2008)). Why not simply use an UAS-GFP?

      -Shot and probably Tau crosslinking activities are important for lumen morphogenesis with a striking increase in the number of embryos without lumen in shot3 and shot3 tauMR22 mutant embryos. The rescue experiments clearly show that Shot binding to both MT and actin is essential for efficient rescue. The same might apply to Tau since it is able to crosslink actin and MTs (Elie, A. et al. Tau co-organizes dynamic microtubule and actin networks. Sci Rep 5, 1-10 (2015)). I believe showing actin and MTs organization in these rescue experiments would be necessary.

      Second, the overexpression experiments indicate that Shot is able to induce extra lumen formation even when unable to bind actin, as shown by the increase in the number of supernumerary lumina (ESLs) under overexpression of ShotC and, to a lesser extent, ShotCtail. This phenotype is also observed under Tau overexpression. This suggests that making MTs more stable, rather than crosslinking them to actin, could be sufficient to promote extra lumen formation in a wt context. Stabilising MTs by treatment with Taxol might thus be sufficient to promote ESL formation. I am fully aware of the difficulty of treating Drosophila embryos with drugs, making this experiment hard to do, but I think this dual function of Shot and Tau (crosslinking actin and MTs to promote branching vs. stabilizing MTs leading to excessive branching) should be discussed.

      Minor comments:

      -p2 line 1: 'acentrosomal luminal branching points' may be better than 'acentrosomal branching points' to describe the phenotype.

      -p4, line 16: the reference 23 is not properly inserted (should be after 'closure').

      -p5, line 16: Please mention what the abbreviations Bnl and Btn stand for.

      -p5, line 20: the 80% of TCs with defects in subcellular lumen formation should appear on the graph in Fig. 1E (as shown in graph 1L).

      -p5, line 26: this 36% value does not seem to correspond to anything on the graph in Fig. 1N. According to the figure legend, 20% of TCs did not elongate at all and the lumen was completely absent (class IV), which is consistent with the result shown in Fig. 1L.

      Also, I am not sure why only 25 TCs were analysed in Fig. 1N while there are the data to analyse more as shown in Fig. 1E (400 TCs), this would make the graph more representative.

      -p6, line 8: ShotA-GFP is indeed a long isoform but is not the full-length Shot, as it does not contain the plakin repeat exon which would add another ~3000aa.

      -p6, lines 21-23: ShotA-GFP localisation is not shown in FigS1. The authors should refer to Fig. 2. Enlarged areas/arrows might help the reader to better visualise the different localisations of ShotA-GFP and ShotC-GFP.

      -p7, line 23: Rca1 mutants should be better introduced here.

      -p8, line 6: Shot colocalizes/associates with stable MTs and actin would be a more appropriate title for this paragraph.

      -p16, line 18: 'Shot is able to mediate crosstalk' would be better than 'Shot is able to crosstalk'.

      -p40, lines 6 and 7: L, M and N should be K', K' and K' respectively.

      -p41, Fig 10D: It is quite hard to see on the cartoon what the phenotype is for Shot OE.

      -The following reference shows an important role for Shot in crosslinking actin and MTs during morphogenesis of the Drosophila embryo and should be cited in this manuscript (Booth, A. J. R., Blanchard, G. B., Adams, R. J. & Röper, K. A Dynamic Microtubule Cytoskeleton Directs Medial Actomyosin Function during Tube Formation. Developmental Cell 29, 562-576 (2014)).

      -FigS3. It would be good to add the labels on the figure (ShotC-GFP in green, and MoeRFP/lifeActinRFP in Magenta).

      Significance

      The findings shown in this manuscript shed an important light on the way subcellular morphogenesis occurs. It was known that both actin and MTs were required in this process, particularly during the formation of Drosophila trachea (JayaNandanan, N., Mathew, R. & Leptin, M. Guidance of subcellular tubulogenesis by actin under the control of a synaptotagmin-like protein and Moesin. Nature Communications 1-10 (2019). doi:10.1038/ncomms4036; Gervais, L. & Casanova, J. In Vivo Coupling of Cell Elongation and Lumen Formation in a Single Cell. Current Biology 20, 359-366 (2010)). This work provides additional molecular insights into the way branching morphogenesis from a single cell occurs in vivo, clearly demonstrating a requirement for actin-MT crosslinking mediated by Shot and Tau.

      This could be of great interest in the field of branching morphogenesis and lumen formation, not only in invertebrates but also in vertebrates where such a crosslinking might occur in the vasculature, the lung, the kidney or the mammary gland for example (Ochoa-Espinosa, A. & Affolter, M. Branching Morphogenesis: From Cells to Organs and Back. Cold Spring Harb Perspect Biol 4, a008243-a008243 (2012)).

      Field of expertise: morphogenesis, Drosophila, cytoskeleton, microtubules.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The response to reviewers consists of three parts:

      1. A summary of the main points from the two reviews, and the authors' response to these points.
      2. A detailed revision plan for the preprint, taking into account both the main points of the reviews, and other comments made by the reviewers.
      3. A point-by-point response to the reviewers.

      For figure citations, OV = old version, i.e. bioRxiv preprint 2019-826180v2, and NV = new version, i.e. revised and re-submitted version.

      1. Summary of main points by the reviewers, and authors’ responses:

      • Both reviewers felt that the manuscript was overlong; Reviewer 1 recommended either shortening it or splitting it into two stories, while Reviewer 2 recommended cutting down the text.
        • We have considerably shortened the manuscript in accordance with this request (see revision plan below). We had already considered splitting the manuscript into two parts during the drafting stage, and had rejected this possibility as the data are intertwined - the retroactive validation of the dimer interface by the mutagenesis constructs (OV Fig. S3 [NV Fig. S4]) being a good example.
        • The revised manuscript features 7 main figures and 13 supplementals.
      • Both reviewers felt too much text and figure space was allocated to negative data, specifically the investigation of potential lipid binding by the TbMORN1 protein, and that there should be more focus on the positive parts of the story.
        • A key part of shortening the manuscript has been moving most of the negative data on lipid binding into the supplemental figures, and considerably shortening the associated text. This has allowed the main figures and associated text to focus more on the positive elements of the project, while still ensuring publication of all the data.
      • The reviewers appear to be in slight disagreement concerning discussion of the data. Reviewer 1 has encouraged more speculation on the physiological role of PE binding, a potential lipid transfer function, a role for calcium ions, the relevance of the observed disulphide bond, and the role of zinc ions in apicomplexan proteins; Reviewer 2 has recommended avoiding excessive speculation or inference.
        • Given that both reviewers have agreed that the original manuscript was overlong, we have implemented Reviewer 2's suggestion here and reduced the amount of speculation in the revised text.
      • The reviewers agreed that the technical quality of the data was high and that the conclusions drawn were robust.
        • We are glad that the reviewers were appreciative of the data quality. For this reason, we were reluctant to remove any of the data from the manuscript and would prefer instead to transfer it to the supplementals. We feel that the negative data still have considerable community value, given that they show that MORN repeats are not automatically lipid binding modules and can thus act as a caveat to other researchers.

      2. Detailed revision plan for the preprint:

      • We have implemented the reviewers' suggestions and substantially shortened the manuscript, primarily by trimming the (phospho)lipid-binding section, which contains a large amount of negative data. The following main figures have been moved into the supplemental section:
        • OV Fig. 2 ("TbMORN1 interacts with phospholipids but not liposomes") has become NV Fig. S2
        • OV Fig. 4 ("TbMORN1(2-15) does not bind to liposomes in vitro") has become NV Fig. S6
        • OV Fig. 8 ("Conservation and properties of residues in TbMORN1(7-15)") has become NV Fig. S11
      • This has left a total of 7 main figures and 13 supplementals.
      • The text associated with the entirety of the lipid-binding part (OV lines 210-530, OV Figs. 2-6 [NV Figs. 2-4, S2, S6], OV Supplemental Figs. 2-6 [NV Supplemental Figs. S3-S5, S7, S8]) has been condensed. The focus of this section is now on the positive parts of the data: the PE association (OV Fig. 3 [NV Fig. 2]) and the in vivo work (OV Figs. 5, 6 [NV Figs. 3, 4]).
      • We have additionally limited the amount of inference and speculation in the manuscript.

      3. Point-by-point responses to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      MORN (membrane occupation and recognition nexus) repeat proteins are found in prokaryotes and eukaryotes. They feature characteristic repeats in their primary sequence, have been assumed to play a role in lipid binding, but remain poorly characterized on the functional and structural level. This manuscript tries to address both these questions and is organized in major parts. In the first part the authors characterize a putative role of MORN repeat proteins in lipid binding and membrane association. In the second part, the authors use X-ray crystallography to establish the structure of MORN repeat proteins and to investigate the dimerization.

      As a cleverly chosen point of departure, they focus their study particularly on MORN1 from Trypanosoma brucei (TbMORN1), which is composed solely of MORN repeats. The structures of MORN repeats (from several species) in part two provide interesting insights into their mode of homotypic interactions and their role as dimerization or oligomerization devices. The lipid binding and membrane association of MORN proteins in the first part remain somewhat confusing and unclear, despite the use of a whole battery of techniques.

      We anticipate that the shortening and refocusing of the lipid binding data has addressed this issue.

      It is questionable why the authors invest so many figures and words to inform the reader of negative results.

      We have chosen to publicise our negative data in full because, as noted in the manuscript, there is a widespread and erroneous assumption that MORN repeats are lipid binding modules. We feel that publishing these data will allow them to act as a caveat to other researchers working on MORN repeat proteins. We have, however, addressed the reviewer's request in that we have considerably shortened the text associated with these data and have moved the corresponding figures into the supplementals.

      The authors suggest that MORN proteins can bind to lipids via their hydrophobic acyl chains, which is very hard to imagine under physiological conditions unless TbMORN1 is a lipid carrier and not a membrane-binding protein. Unfortunately, a role as lipid carrier has not been rigorously tested.

      The reviewer is correct that we have not specifically tested for a function as a lipid carrier protein and although this was only speculation, it has been toned down accordingly.

      In this sense the first part remains somewhat immature and incoherent. Furthermore, they suggest based on the lack-of-evidence that MORN proteins do not bind membranes in vivo and in vitro.

      We are not clear where this suggestion was made. Our data indicate that TbMORN1 does not directly bind membranes in vivo or in vitro, and we therefore noted that putative lipid binding by other MORN repeat proteins should be viewed with caution. Specifically, we stated in the Discussion (OV lines 955-956) that "the presence of MORN repeats in a protein should not be taken as indicative of lipid binding or lipid membrane binding without experimental evidence". Again, our expectation is that the major changes planned for the data presentation in this section will make it more coherent.

      The main issue of this manuscript is, in my view, the way the data were presented. The manuscript is generally well-written, but much too long. The structural work is important and concise.

      We have considerably shortened the manuscript as per the reviewer's request, and especially the section on lipid binding.

      The first part, however, reports in five separate figures on a lack of membrane binding by a MORN protein and its ability to bind individual lipids. The physiological relevance of this lipid binding is questionable, as acknowledged by the authors.

      We have moved two of these figures (OV Figs. 2, 4) into the supplementals section [NV Figs. S2, S6], shortened the associated text, and limited the amount of speculation.

      Even though I find it important that the membrane/lipid binding ability of MORN proteins is rigorously tested, I would highly recommend to separate the current manuscript in two independent stories. Alternatively, I would recommend to reduce the first part into a single figure and to remove the most artifactual assays.

      We have implemented the second of these two suggestions for the manuscript. We had already considered splitting the manuscript during the drafting stage, but rejected this possibility as the data were too intertwined. Consequently, we have opted to considerably reduce the first part, and moved OV Figs. 2 and 4 into the supplementals [NV Figs. S2, S6]. We would prefer not to remove data altogether as they are likely to have community value even if they are negative and as noted, they are of good quality.

      In the current form, the first part and the second part of the manuscript remain somewhat detached from each other. The characterization of the lipid binding/membrane binding properties has a number of substantial weaknesses (e.g. use of quite different, nonphysiological buffers for membrane binding assays; use of deletion mutants for the binding assays, which do not show the full potential of oligomerization). This makes it hard to read and confuses the reader. Even though I have no reason to doubt the conclusions by the authors, I do not think that all necessary caution has been invested to rule out other possibilities.

      We believe that the shortening and refocusing of the manuscript should address these issues. For consideration of the buffer and deletion mutant points, please see responses to Major Points below.

      In summary, even though the technical quality of the individual performed assays is high, there are some conceptual issues that make it hard to make a strong case based on a collection of individual, clear datasets. Even though I find the structures of the MORN proteins important, timely, and interesting, I would not recommend this study for publication in its current form. The manuscript would be more fun to read if both of the parts would be shortened substantially and more focused.

      We have implemented this suggestion: the manuscript has been considerably shortened (from 20,489/135,073 to 18,555/103,988 words/characters, with the reduction focused on the negative lipid-binding results).

      While I agree that most evidence provided on lipid/membrane binding of TbMORN1 argue against a direct role of MORN proteins in membrane binding, I feel that the experimental approach is not coherent enough. See a few major points of criticism below.

      Major Points:

      1. The authors decide to characterize the membrane binding of a MORN repeat protein using a deletion variant that lacks the N-terminal repeat. However, in Figure 1B they show that the N-terminal repeat is important for the formation of higher-order oligomers. While I fully understand that the presence of the most N-terminal repeat does hamper the structural work, I find it problematic to remove it for the lipid/membrane-binding assays. The formation of higher oligomeric species beyond the dimer, may be important for membrane binding/recruitment (avidity effects).

      As we explained in the manuscript, the reason for not using the full-length protein for in vitro work was because it was polydisperse, and that the yields were extremely low. See OV lines 178-179 ("The yields of TbMORN1(1-15) were always very low, making this construct not generally suitable for in vitro assays".) and OV lines 411-414 ("...TbMORN1(1-15), which was polydisperse in vitro and formed large oligomers (Fig. 1B). The membrane-binding activity of these polydisperse oligomers was not possible to test in vitro, as the purification yields of TbMORN1(1-15) were always low."). Consequently, we used the longest construct that was suitable in terms of chemical and oligomeric homogeneity. Using the full-length protein would have had inherent problems with aggregation, and consequently would have compromised the data and derived results. In order to make this clear in the manuscript we edited the sentence mentioned above as follows:

      “It was not possible to test the membrane-binding activity of these polydisperse oligomers in vitro however, as the purification yields of TbMORN1(1-15) were always low. As an alternative, the possible membrane association of TbMORN1(1-15) was examined in vivo."

      2) (Related to point 1) I do not understand the choice of the buffers used for some of the assays. The use of pH 8.5 and NaCl concentrations of 200 mM are non-physiological.

      These were the buffer conditions required to retain the protein in a monodisperse state, suitable for in vitro assays.

      For CD spectroscopy, a high ionic strength was obtained by the use of 200 mM NaF. If a high ionic strength is required to prevent the formation of higher oligomers of MORN, it raises the question if the formation of higher oligomers (under physiological conditions) may also contribute to their function.

      The oligomers of TbMORN1 may indeed be the most functionally relevant form of TbMORN1 but we do not currently have a means of testing this in vitro, as acknowledged in the text (OV lines 411-414, quoted above). The aim of CD spectroscopy was to assess fold integrity and stability of different constructs; we used buffers as recommended for the CD spectroscopy experiments by Kelly et al, 2005 (doi:10.1016/j.bbapap.2005.06.005) (Table 1 and section 4.2). Furthermore, the CD spectra of TbMORN(1-15) and TbMORN(2-15) (OV Fig. S1E [NV Fig. S1E]) are basically superimposable, suggesting identical secondary structure content at the concentration used for these experiments.

      It is unclear, in which buffer the fluorescence anisotropy measurements were performed.

      We have provided details on the buffer conditions for the fluorescence anisotropy experiments in the Materials and Methods section, NV page 23, lines 962-963.

      The sucrose-loaded vesicles were hydrated in a 20 mM HEPES pH 7.4, 0.3 M Sucrose. The composition of the buffer after the addition of MORN proteins is not clear.

      The Materials and Methods are now unambiguous on this point. Please see NV lines 1036-1046: "6 μM Rhodamine B dihexadecanoyl phosphoethanolamine (Rh-DHPE) was added to all lipid mixtures to facilitate the visualisation of the SLVs. The lipid mixtures were dried under a nitrogen stream, and the lipid films hydrated in 20 mM HEPES pH 7.4; 0.3 M sucrose. The lipid mixtures were subjected to 4 cycles of freezing in liquid nitrogen followed by thawing in a sonicating water bath at RT. The vesicles were pelleted by centrifugation (250,000 × g, 30 min, RT) and resuspended in 20 mM HEPES pH 7.4, 100 mM KCl to a total lipid concentration of 1 mM. SLVs were incubated with 1.5 μM purified TbMORN1(2-15) in gel filtration buffer (20 mM Tris-HCl pH 8.5, 200 mM NaCl, 2% glycerol, 1 mM DTT) at a 1:1 ratio (30 min, RT)." The liposomes were at physiological pH and close to physiological ionic strength.
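
      As a rough illustration of the buffer composition after mixing, the arithmetic can be sketched as below. This is only a sketch under two assumptions that the quoted methods do not state explicitly: that "1:1 ratio" means equal volumes, and that the listed concentrations (including 1.5 μM protein and 1 mM lipid) refer to the two solutions before mixing. The function and dictionary key names are hypothetical, and the final pH is not estimated.

```python
def mix_equal_volumes(solution_a, solution_b):
    """Return component concentrations after mixing two solutions 1:1 by volume.

    Each component is diluted two-fold; a component missing from one
    solution is treated as having concentration 0 there.
    """
    components = set(solution_a) | set(solution_b)
    return {
        name: (solution_a.get(name, 0.0) + solution_b.get(name, 0.0)) / 2
        for name in sorted(components)
    }


# Vesicle suspension as quoted: 20 mM HEPES pH 7.4, 100 mM KCl, 1 mM total lipid.
vesicle_buffer = {"HEPES_mM": 20.0, "KCl_mM": 100.0, "lipid_mM": 1.0}

# Protein in gel filtration buffer as quoted: 20 mM Tris-HCl pH 8.5,
# 200 mM NaCl, 2% glycerol, 1 mM DTT, 1.5 uM TbMORN1(2-15).
# Assumption: 1.5 uM is the concentration before mixing.
protein_buffer = {
    "Tris_mM": 20.0,
    "NaCl_mM": 200.0,
    "glycerol_pct": 2.0,
    "DTT_mM": 1.0,
    "protein_uM": 1.5,
}

final = mix_equal_volumes(vesicle_buffer, protein_buffer)
for name, value in final.items():
    print(f"{name}: {value:g}")
```

      Under these assumptions the mixture would contain about 10 mM HEPES, 10 mM Tris, 50 mM KCl and 100 mM NaCl, i.e. roughly 150 mM total monovalent salt.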

      Despite the use of an impressive array of techniques, this first part of the manuscript remains somewhat immature and incoherent. Due to the use of constructs that have not the full ability to oligomerize (point 1) and due to the inconsistent use of experimental conditions, it is hard to draw firm conclusions from this first part.

      Any biochemical study is conducted within the constraints of the choice of construct and the choice of buffer conditions, and the data are valid within those parameters. This applies as much to positive data as to negative data, so we are not clear why the reviewer is placing such emphasis on this point. In the case of the LiMA data, which are the most unbiased and comprehensive dataset in the manuscript, these experiments were well-controlled and there were also domains present that were recruited to membranes under the buffer conditions, allowing us to rule out that the assay conditions were completely unsuitable. Validating negative results should be done as carefully and with as many orthogonal approaches as the validation of positive results. The reviewer acknowledges below that "the data point in the direction that MORN proteins (or at least TbMORN1) does not directly bind to membranes". This is the conclusion that we wanted to communicate.

      For example: In Figure 2E TbMORN(2-15) does show some concentration-dependent binding, which -however- is interpreted as background binding. What are the results using this assay (or better: a liposome floatation assay) when using full-length TbMORN(1-15) in a more physiological buffer?

      As noted already, it is not possible to use the TbMORN1(1-15) construct for in vitro assays owing to the extremely low yields and polydisperse nature of the protein. The excess full-length protein was associated with the cytosolic fraction and not the membrane fraction in vivo (OV Fig. 6B [NV Fig. 4B]).

      The statement that MORN proteins bind to lipids, but not to liposomes/membranes is -in my view- not sufficiently addressed to make a strong case.

      At no point do we suggest that MORN repeat proteins in general bind to lipids and not to liposomes/membranes. On the contrary, and as detailed in the manuscript, we set out to assay the lipid binding activity of TbMORN1, found that it appears to bind to lipids but not to liposomes/membranes, and have therefore cautioned that lipid or liposome/membrane binding of other MORN repeat proteins must be tested experimentally before claims of function are made.

      3) The physiological relevance of lipid binding to MORN proteins remains obscure (as also acknowledged by the authors). Does the binding of PE lipids to the MORN protein have a physiological role? Does the binding of fluorescent PI(4,5)P2 point to a physiological role of MORN proteins?

      These are interesting questions that we would like to address in future work.

      4) In light of recent data from the Chris Stefan lab (PMID: 31402097) a co-incidence detection of PI(4,5)P2, PS, and cholesterol seems possible. Can the authors address this possibility?

      Again, the involvement of cholesterol, PS, and PI(4,5)P2 would be interesting questions for subsequent work but are beyond the scope of the present study. We did partially address this issue in our use of PI(4,5)P2, POPC and cholesterol containing liposomes in liposome cosedimentation assays, which showed no binding (OV Fig. S3A [NV Fig. S4A]).

      Furthermore, the role of Ca2+ signaling / Ca2+ ions has not been addressed. In light of the important role of Ca2+ for the recognition of PI(4,5)P2 (PMID: 28177616), this point should be addressed.

      We carried out liposome pelleting assays in the presence of Ca2+ and Mg2+, and saw no binding by TbMORN1(2-15) in either condition (see data below). These data were not included in the MS because of the insufficient number of technical replicates available.

      5) For characterizing the binding of lipids to MORN proteins, the authors use nonphysiological fluorescent and short-chain lipid analogues at concentrations that are unlikely to occur for endogenous PIPs in the cytosol of cells. Why choose such an artificial system? Why introduce this system at length, if other, less artifact-prone assays are available? I would recommend not featuring this assay as prominently as it was in the current study.

      Our aim was to stick to using the same fluorophore throughout all the experiments. The choice of short-chain lipids was constrained by what was commercially available with the BODIPY TMR fluorophore. We have implemented the reviewer's suggestion in the manuscript, and the text associated with the fluorescence anisotropy assays has been considerably shortened. We are aware that the chosen concentration of the fluorescent lipids was out of physiological range, but the requirements of the fluorescence anisotropy itself necessitated a compromise. The possible shortcomings of the fluorescence anisotropy assays are, we believe, more than amply compensated by the LiMA data.

      6) How would PE find its way to the lipid binding region in MORN? Would it diffuse to the MORN protein via the aqueous phase, or would the MORN protein pick up PE from membranes upon collision? The authors should address this point by separating the lipid-depleted MORN protein from donor vesicles containing PE by a dialysis membrane. If PE would not find its way to the lipid binding site of MORN, this would imply that MORN protein can extract lipids only upon colliding with the membrane. What is the stoichiometry of PE to MORN?

      These are all interesting questions that we would like to pursue in subsequent work, but we feel that they are beyond the scope of the present study. Until we have conditions suitable for obtaining high yields and monodisperse populations of the full-length protein, which probably also necessitates developing conditions for controlled oligomerisation, it would be premature to start this. As to how it picks up PE: it is well known that specific lipid binding/chaperoning proteins can deliver their lipid cargo to other proteins. Additionally, proteins that bind lipids use hydrophobic domains to both interact with and sequester fatty acids and/or lipids from membranes. The literature is populated with lots of such examples. https://www.sciencedirect.com/science/article/pii/S0092867416310765.

      Despite my critique raised above, I agree with the authors that the data point in the direction that MORN proteins (or at least TbMORN1) does not directly bind to membranes. Their data, however, would still be consistent with a role as lipid transfer protein and a recruitment of MORN proteins to the membrane by other proteins. Have the authors performed any additional experiments in this direction? Also, the potential role of palmitoylation is only mentioned in the discussion (page 22), while palmitoylation would provide a simple means for membrane recruitment.

      We are glad that the reviewer concurs with our main conclusion. We agree, as noted in the discussion, that a role as a lipid transfer protein might still be possible, and this is something that we would like to pursue in follow-up work. We have not yet performed any additional experiments in this direction. Concerning palmitoylation, the predictions using the CSS-Palm software were always weak and ambiguous, and in addition the best candidate cysteine residue was Cys351, which is in our structure engaged in the disulphide bond observed in the C2 crystal form. We feel that this is something to keep in mind, but is not yet a strong enough hypothesis to pursue intensively.

      Minor Points:

      Figure 1B: The authors should provide information on the void volume of the column.

      Implemented in the figure legend (7.2 ml).

      Page 17, line 696-701: The authors point out that the C2 crystal form is stabilized by two disulfide bridges. The authors should comment on the physiological relevance of these disulfide bridges.

      Given the reducing environment of the cytosol, it is an open question as to whether these disulphide bridges exist in vivo. We would prefer not to speculate on this point, as we do not feel it would be productive.

      Page 18, line 734-740: The authors should provide data on the potential role of Zn2+ on MORN function in a physiological context. The section describing that the dimer is stabilized by Zn2+ ions (pages 18 and 19) lacks a discussion if Zn2+ are functionally relevant. There is only a beautiful sequence analysis and a discussion of the conservation of the Zn2+ coordinating residues. Can the authors perform Zn2+ titrations and SEC-MALS experiments (or alternatives such as SAXS) to show that Zn2+ indeed affects the oligomeric state of only the PfMORN, but not the other MORN proteins that form alternative dimers?

      The known requirement for zinc ions in Plasmodium growth was already noted (OV lines 992-993, Marvin et al., 2012), and is, we believe, sufficient to address the issue of physiological relevance at this stage. The zinc ions are predicted to affect the architecture of the apicomplexan (Plasmodium, Toxoplasma) MORN1 protein dimers, not their oligomeric state. For PfMORN1, SEC-MALS and SAXS were carried out in 20 mM Tris-HCl pH 7.5, 100 mM NaCl with no zinc present. When EDTA was added, no change in behaviour of the protein was seen by SEC-MALS. When TPEN, a strong zinc chelator, was added, the protein precipitated in SEC-MALS experiments.

      Reviewer #1 (Significance):

      A putative role of MORN proteins in membrane and lipid binding is addressed. The view that MORN proteins bind directly to membranes is challenged. Structures of dimeric MORN proteins provide important insight into the modes of dimerization.

      There is a recent structure of MORN proteins (which is referenced by the authors), but I feel that additional structural work is important and justified. The work on membrane vs. lipid binding is important, but not sufficiently addressed in the current manuscript.

      We are glad that the reviewer finds the structural work important and justified, although we disagree with the reviewer’s assessment of the lipid binding. As noted in the previous paragraph, our data challenge the assumption that MORN repeat proteins directly bind membranes, and we feel that this alone is a significant conceptual advance.

      I would recommend separating the study into two parts. The audience is likely to be confused (or bored) by the lengthy discussion of whether or not MORN proteins bind lipids and/or membranes.

      We would prefer to implement the reviewer's other suggestion, namely that the manuscript is considerably shortened and less focus given to the negative data on lipid binding.

      I am not an expert in structural biology, but have a fair understanding of structural biology. I have worked on lipid binding proteins and have a very good understanding of lipid/membrane-binding assays.


      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary

      The manuscript describes an extensive and detailed investigation into the structure and function(s) of MORN domains. It has to be acknowledged that, despite the considerable amount of work reported, the conclusions are rather limited. From a technical viewpoint, the experiments have been appropriately executed and, generally, I concur with the conclusions drawn. However, the manuscript is over-long: in general, I would recommend concentrating on positive conclusions which can be drawn from the data and avoid excessive speculation or inference (some examples given below).

      We are glad that the reviewer is satisfied with the technical quality of the work and (in general) the validity of the conclusions. We acknowledge that the original submission was fairly long, and have considerably shortened the revised manuscript and focused more on the positive conclusions in order to implement this suggestion.

      Major Comments

      There are three general- perhaps rather obvious- points to make. First, there is no particular reason to think that conservation of structure necessarily indicates conservation of a particular function. There seems to be an implicit assumption that MORN domains are associated with a specific, well-defined biological function. Given their diversity, are there particular reasons to think that this is the case?

      The reviewer is exactly right that there is an implicit assumption that MORN domains are associated with a specific, well-defined function: specifically, lipid binding. It is this assumption, which has been widely circulated in the almost complete absence of experimental evidence, that we are challenging. We agree that MORN repeats are likely to be capable of multiple functions, and protein-protein interactions are now better supported than protein-lipid interactions.

      Second, a strategy which examines the properties of just the recombinant MORN domains in vitro, removed from the context of the whole protein (eg junctophilin) or- importantly- its interacting partners in vivo, has obvious limitations. Frequently a reductionist approach is successful; however, in this case, MORN domains appear to be less tractable to that kind of approach. For all the in vitro binding and structural experiments presented, there is always a concern that the absence of other parts of the relevant MORN-containing protein or its partners could explain failure or inconsistency of in vitro biological activity measurements.

      Again, the reviewer is right that there is an inherent contextual limitation to any in vitro work that utilises a single protein, but this is a concern that - by definition - could be raised about any in vitro study utilising a single protein. It should be noted that we have also carried out in vivo experiments using TbMORN1 (OV Figs. 5, 6 [NV Figs. 3, 4]).

      Third, the possibility that MORN domains might mediate interactions with other proteins seems to be given little consideration, in spite of the Li et al (2019) paper. An experimental strategy which looked for binding partners (eg by pulldown assay) might have provided more insight.

      These data are already in the literature. A previous study by the same team (Morriswood et al., 2013) used proximity-dependent biotin identification to identify candidate binding partners and near neighbours of TbMORN1.

      In order to stress this point we added the following sentence in the discussion section, NV pages 18-19, lines 774-778.

      “The concluding data presented here suggest that TbMORN1 utilises this oligomerisation capacity to build mesh-like assemblies, which can reach considerable size in vitro (Fig. 7G). These mesh-like assemblies may reflect the endogenous organisation of the protein in vivo, where a number of binding partners have already been identified (Morriswood et al., 2013)”.

      Minor Comments

      1. In the abstract and elsewhere the authors refer to a possible function of MORN domains as 'dimerisation and oligomerisation devices' (line 53). What is the evidence that dimer formation is important for function in vivo?

      This is an interesting and important question and one that we would like to address in future work. We did attempt to generate trypanosome cell lines that inducibly expressed monomeric TbMORN1 (the double mutant, where the point mutations were simultaneously introduced in the dimerisation interface in repeats 13 and 14), but no expression of the ectopic protein was ever observed (9 separate clones obtained in 3 independent transfections). This might indicate the importance of the dimeric state in vivo, perhaps hinting that dimerisation is important for protection from degradation. In general, proteins assuming higher oligomeric states in homo- or heteromeric assemblies benefit from increased robustness in the cellular environment and optimised activity by the following means:

      • Increased stability by decreasing the surface area/volume ratio
      • Simple construction of larger complexes
      • Allosteric regulation
      • Co-localisation of distinct biological functions
      • Substrate channelling
      • Protection from aggregation or degradation

      Which of these factors, or which combination of them, is relevant for TbMORN1 being a functional dimer in vivo is difficult to say at this point.

      2. Did the authors attempt to co-crystallize TbMORN1(7-15) with PI(4,5)P2?

      No. For crystallisation, we used lysine methylated samples, and by doing this we neutralised positively-charged potential binding sites which would have interacted with the negatively charged lipid headgroup. We did not observe any bound lipids in the electron density maps obtained from the crystals.

      3. Fig 2C: did the authors also estimate binding stoichiometry as well as the equilibrium binding constants for these data? This should be determined by fitting a single binding site model to the data. Other methods (eg ITC) can probably determine this with more accuracy. The value of stoichiometry is sometimes forgotten in such binding measurements- is one ligand bound per monomer or dimer, for example?

      We discussed estimation of the binding stoichiometry in the fluorescence anisotropy assays at some length, but the conclusion was that the required experiments would contain too many approximations to provide high-confidence data. We did use ITC and also MST, but did not observe any binding with these assays.
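For readers unfamiliar with the single binding site model the reviewer refers to, the following is a minimal illustrative sketch, not the analysis performed in the study. It assumes a fixed concentration of labelled ligand titrated with protein, a hypothetical number of sites per protein unit (`n_sites`), and known anisotropy values for free and bound ligand; all function names, concentrations, and the grid-search fit are assumptions introduced here for illustration. Stoichiometry and Kd are only separable in practice when the ligand concentration is comparable to or above Kd, which is one reason such estimates can carry large uncertainty.

```python
import math


def fraction_bound(protein, ligand, kd, n_sites):
    """Fraction of labelled ligand bound in a single-site model.

    Uses the quadratic solution that accounts for ligand depletion.
    All concentrations share the same (arbitrary) unit.
    """
    sites = n_sites * protein  # total concentration of binding sites
    b = ligand + sites + kd
    return (b - math.sqrt(b * b - 4.0 * ligand * sites)) / (2.0 * ligand)


def anisotropy(protein, ligand, kd, n_sites, r_free, r_bound):
    """Observed anisotropy as a weighted mix of free and bound ligand."""
    fb = fraction_bound(protein, ligand, kd, n_sites)
    return r_free + (r_bound - r_free) * fb


def fit_grid(proteins, observed, ligand, r_free, r_bound, kd_grid, n_grid):
    """Brute-force least-squares search over candidate Kd and n values."""
    best = None
    for kd in kd_grid:
        for n in n_grid:
            sse = sum(
                (anisotropy(p, ligand, kd, n, r_free, r_bound) - obs) ** 2
                for p, obs in zip(proteins, observed)
            )
            if best is None or sse < best[0]:
                best = (sse, kd, n)
    return best[1], best[2]


# Synthetic, noise-free titration (hypothetical values, not measured data).
proteins = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
ligand = 5.0
data = [anisotropy(p, ligand, 2.0, 2.0, 0.05, 0.25) for p in proteins]
kd_fit, n_fit = fit_grid(
    proteins, data, ligand, 0.05, 0.25,
    kd_grid=[0.5, 1.0, 2.0, 4.0, 8.0],
    n_grid=[0.5, 1.0, 2.0],
)
```

A real analysis would replace the grid search with a nonlinear least-squares routine and report confidence intervals on both parameters.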

      4. Lines 674-678 I found it hard to work out whether these constructs harbour the natural C-terminal sequence without truncation or addition of an affinity tag. I think the answer is 'yes' but it was difficult working this out from the details in M&M.

      TbMORN1(7-15) crystallisation was with a C-terminal Strep tag; TgMORN1(7-15) and PfMORN1(7-15) had their affinity tags removed by protease treatment prior to crystallisation. We have clarified this point in the M&M, page 29, lines 1189-1192: “Crystallisation of TbMORN1(7-15) (with a C-terminal Strep tag), TgMORN1(7-15) and PfMORN1(7-15) (both with affinity tags removed) was performed at 22 °C using a sitting-drop vapour diffusion technique and micro-dispensing liquid handling robots (Phoenix RE (Art Robbins Instruments) and Mosquito (TTP labtech)).”

      5. Lines 688-694 The PISA interface analysis is useful here in distinguishing crystal contacts from those which persist in solution. The discussion of the results is unclear, however, on this critical point: were the dimer interfaces the only contacts which were significant in the various crystal forms?

      Yes, correct. PISA showed that the described dimerisation contacts were the only significant ones in the various crystal forms. Other crystal contacts typically had low P-values, poor ΔG and a small “radar” surface in the overall PISA analysis.

      In the case of both TbMORN1 crystal forms and the TgMORN1 P43212 crystal form we have a dimer in the asymmetric unit, while in the case of the PfMORN1 and TgMORN1 P6222 forms we have one molecule in the asymmetric unit, and the dimer is created by the crystallographic twofold axis. In the latter cases the quaternary structure resulting from the symmetry operations was the top-scoring one considering the P-values and/or the number of stabilising interactions and the buried surface area.

      6. Lines 754-763 This paragraph seems rather speculative and is a good example where the text could be cut down.

      If the line citation is correct, then we disagree with this assessment and would prefer not to implement it. The paragraph in question concerns a detailed and very precise discussion of the side chain interactions that stabilise the V-shaped forms of TgMORN1 and PfMORN1.

      7. Line 765-788 This section is also rather overdone: such observations are only useful if they are subsequently tested by recording dimer conformation for a representative selection of MORN dimers from different species.

      Again, we disagree with the reviewer's assessment of this analysis. The analysis has considerable predictive power and already has some experimental validation via the SAXS observation that PfMORN1 is capable of forming extended dimers in solution (OV Fig. 10C [NV Fig. 7C]).

      8. Lines 800-801 I don't think this statement is strictly correct. The SAXS data show that PfMORN1(7-15) adopts an extended conformation, with no evidence of the 'V' shaped structure. Related to that point, from what I could glean from the SAXS Methods section, all solution conditions for these experiments were conducted without Zn2+? If some dimer interfaces require Zn2+, should it not be included?

      We have clarified this statement. The SAXS experiments were conducted without zinc, and, as we have stressed, the V-shaped form of TgMORN1 and PfMORN1 was only ever observed in the crystals. For PfMORN1, SEC-MALS and SAXS were carried out in 20 mM Tris-HCl pH 7.5, 100 mM NaCl with no zinc present. When EDTA was added, no change in behaviour of the protein was seen by SEC-MALS. When “TPEN”, a strong zinc chelator, was added, the protein precipitated in SEC-MALS experiments.

      Reviewer #2 (Significance):

      There is certainly value in establishing that MORN domains do not, in vitro, appear to bind to lipid vesicles, and to define their lipid binding capability (although it is rather complex). The crystal structures and SAXS data extend the rather limited structural data on MORN domains. Despite the effort involved, conclusions about likely functions of MORN domains in vivo are rather limited.

      We are glad that the reviewer acknowledges the value in challenging the assumption that MORN repeats are lipid binding devices, and that the structural data are important for expanding the knowledge base on this class of repeat motif proteins. In vivo functional work is being actively pursued at present.

      My expertise lies in X-ray crystallography and protein biochemistry.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript describes an extensive and detailed investigation into the structure and function(s) of MORN domains. It has to be acknowledged that, despite the considerable amount of work reported, the conclusions are rather limited. From a technical viewpoint, the experiments have been appropriately executed and, generally, I concur with the conclusions drawn. However, the manuscript is over-long: in general, I would recommend concentrating on positive conclusions which can be drawn from the data and avoid excessive speculation or inference (some examples given below).

      Major Comments

      There are three general- perhaps rather obvious- points to make. First, there is no particular reason to think that conservation of structure necessarily indicates conservation of a particular function. There seems to be an implicit assumption that MORN domains are associated with a specific, well-defined biological function. Given their diversity, are there particular reasons to think that this is the case? Second, a strategy which examines the properties of just the recombinant MORN domains in vitro, removed from the context of the whole protein (eg junctophilin) or- importantly- its interacting partners in vivo, has obvious limitations. Frequently a reductionist approach is successful; however, in this case, MORN domains appear to be less tractable to that kind of approach. For all the in vitro binding and structural experiments presented, there is always a concern that the absence of other parts of the relevant MORN-containing protein or its partners could explain failure or inconsistency of in vitro biological activity measurements. Third, the possibility that MORN domains might mediate interactions with other proteins seems to be given little consideration, in spite of the Li et al (2019) paper. An experimental strategy which looked for binding partners (eg by pulldown assay) might have provided more insight.

      Minor Comments

      1. In the abstract and elsewhere the authors refer to a possible function of MORN domains as 'dimerisation and oligomerisation devices' (line 53). What is the evidence that dimer formation is important for function in vivo?
      2. Did the authors attempt to co-crystallize TbMORN1(7-15) with PI(4,5)P2?
      3. Fig 2C: did the authors also estimate binding stoichiometry as well as the equilibrium binding constants for these data? This should be determined by fitting a single binding site model to the data. Other methods (eg ITC) can probably determine this with more accuracy. The value of stoichiometry is sometimes forgotten in such binding measurements- is one ligand bound per monomer or dimer, for example?
      4. Lines 674-678 I found it hard to work out whether these constructs harbour the natural C-terminal sequence without truncation or addition of an affinity tag. I think the answer is 'yes' but it was difficult working this out from the details in M&M.
      5. Lines 688-694 The PISA interface analysis is useful here in distinguishing crystal contacts from those which persist in solution. The discussion of the results is unclear, however, on this critical point: were the dimer interfaces the only contacts which were significant in the various crystal forms?
      6. Lines 754-763 This paragraph seems rather speculative and is a good example where the text could be cut down.
      7. Line 765-788 This section is also rather overdone: such observations are only useful if they are subsequently tested by recording dimer conformation for a representative selection of MORN dimers from different species.
      8. Lines 800-801 I don't think this statement is strictly correct. The SAXS data show that PfMORN1(7-15) adopts an extended conformation, with no evidence of the 'V' shaped structure. Related to that point, from what I could glean from the SAXS Methods section, all solution conditions for these experiments were conducted without Zn2+? If some dimer interfaces require Zn2+, should it not be included?

      Significance

      There is certainly value in establishing that MORN domains do not, in vitro, appear to bind to lipid vesicles, and to define their lipid binding capability (although it is rather complex). The crystal structures and SAXS data extend the rather limited structural data on MORN domains. Despite the effort involved, conclusions about likely functions of MORN domains in vivo are rather limited. My expertise lies in X-ray crystallography and protein biochemistry.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      MORN (membrane occupation and recognition nexus) repeat proteins are found in prokaryotes and eukaryotes. They feature characteristic repeats in their primary sequence, have been assumed to play a role in lipid binding, but remain poorly characterized on the functional and structural level. This manuscript tries to address both these questions and is organized in two major parts. In the first part the authors characterize a putative role of MORN repeat proteins in lipid binding and membrane association. In the second part, the authors use X-ray crystallography to establish the structure of MORN repeat proteins and to investigate the dimerization.

      As a cleverly chosen point of departure, they focus their study particularly on MORN1 from Trypanosoma brucei (TbMORN1), which is composed solely of MORN repeats. The structures of MORN repeats (from several species) in part two provide interesting insights into their mode of homotypic interactions and their role as dimerization or oligomerization devices. The lipid binding and membrane association of MORN proteins in the first part remains somewhat confusing and unclear, despite the use of a whole battery of techniques. It is questionable why the authors invest so many figures and words to inform the reader on negative results. The authors suggest that MORN proteins can bind to lipids via their hydrophobic acyl chains- which is 'very hard to imagine under physiological conditions unless TbMORN1 is a lipid carrier and not a membrane-binding protein.' Unfortunately, a role as lipid carrier has not been rigorously tested. In this sense the first part remains somewhat immature and incoherent. Furthermore, they suggest, based on the lack of evidence, that MORN proteins do not bind membranes in vivo and in vitro.

      The main issue of this manuscript is, in my view, the way the data were presented. The manuscript is generally well-written, but much too long. The structural work is important and concise. The first part, however, reports in five separate figures on a lack of membrane binding by a MORN protein and its ability to bind individual lipids. The physiological relevance of this lipid binding is questionable, as acknowledged by the authors. Even though I find it important that the membrane/lipid binding ability of MORN proteins is rigorously tested, I would highly recommend separating the current manuscript into two independent stories. Alternatively, I would recommend reducing the first part to a single figure and removing the most artifactual assays. In the current form, the first part and the second part of the manuscript remain somewhat detached from each other. The characterization of the lipid binding/membrane binding properties has a number of substantial weaknesses (e.g. use of quite different, non-physiological buffers for membrane binding assays; use of deletion mutants for the binding assays, which do not show the full potential of oligomerization). This makes it hard to read and confuses the reader. Even though I have no reason to doubt the conclusions by the authors, I do not think that all necessary caution has been invested to rule out other possibilities.

      In summary, even though the technical quality of the individual performed assays is high, there are some conceptual issues that make it hard to make a strong case based on a collection of individual, clear datasets. Even though I find the structures of the MORN proteins important, timely, and interesting, I would not recommend this study for publication in its current form. The manuscript would be more fun to read if both of the parts would be shortened substantially and more focused. While I agree that most evidence provided on lipid/membrane binding of TbMORN1 argue against a direct role of MORN proteins in membrane binding, I feel that the experimental approach is not coherent enough. See a few major points of criticism below.

      Major Points:

      1) The authors decide to characterize the membrane binding of a MORN repeat protein using a deletion variant that lacks the N-terminal repeat. However, in Figure 1B they show that the N-terminal repeat is important for the formation of higher-order oligomers. While I fully understand that the presence of the most N-terminal repeat does hamper the structural work, I find it problematic to remove it for the lipid/membrane-binding assays. The formation of higher oligomeric species beyond the dimer may be important for membrane binding/recruitment (avidity effects).

      2) (Related to point 1) I do not understand the choice of the buffers used for some of the assays. The use of pH 8.5 and NaCl concentrations of 200 mM are non-physiological. For CD spectroscopy, a high ionic strength was obtained by the use of 200 mM NaF. If a high ionic strength is required to prevent the formation of higher oligomers of MORN, it raises the question of whether the formation of higher oligomers (under physiological conditions) may also contribute to their function. It is unclear in which buffer the fluorescence anisotropy measurements were performed. The sucrose-loaded vesicles were hydrated in 20 mM HEPES pH 7.4, 0.3 M sucrose. The composition of the buffer after the addition of MORN proteins is not clear. Despite the use of an impressive array of techniques, this first part of the manuscript remains somewhat immature and incoherent. Due to the use of constructs that do not have the full ability to oligomerize (point 1) and due to the inconsistent use of experimental conditions, it is hard to draw firm conclusions from this first part. For example: In Figure 2E TbMORN(2-15) does show some concentration-dependent binding, which -however- is interpreted as background binding. What are the results using this assay (or better: a liposome flotation assay) when using full-length TbMORN(1-15) in a more physiological buffer? The statement that MORN proteins bind to lipids, but not to liposomes/membranes is -in my view- not sufficiently addressed to make a strong case.

      3) The physiological relevance of lipid binding to MORN proteins remains obscure (as also acknowledged by the authors). Does the binding of PE lipids to the MORN protein have a physiological role? Does the binding of fluorescent PI(4,5)P2 point to a physiological role of MORN proteins?

      4) In light of recent data from the Chris Stefan lab (PMID: 31402097) a co-incidence detection of PI(4,5)P2, PS, and cholesterol seems possible. Can the authors address this possibility? Furthermore, the role of Ca2+ signaling / Ca2+ ions has not been addressed. In light of the important role of Ca2+ for the recognition of PI(4,5)P2 (PMID: 28177616), this point should be addressed.

      5) For characterizing the binding of lipids to MORN proteins, the authors use non-physiological fluorescent and short-chain lipid analogues at concentrations which are unlikely to occur for endogenous PIPs in the cytosol of cells. Why choose such an artificial system? Why introduce this system at length, if other -less artifact-prone- assays are available? I would recommend not featuring this assay as prominently as it was in the current study.

      6) How would PE find its way to the lipid binding region in MORN? Would it diffuse to the MORN protein via the aqueous phase or would the MORN protein pick up PE from membranes upon collision? The authors should address this point by separating the lipid-depleted MORN protein from donor vesicles containing PE by a dialysis membrane. If PE would not find its way to the lipid binding site of MORN, this would imply that MORN protein can extract lipids only upon colliding with the membrane. What is the stoichiometry of PE to MORN?

      Despite my critique raised above, I agree with the authors that the data point in the direction that MORN proteins (or at least TbMORN1) do not directly bind to membranes. Their data, however, would still be consistent with a role as a lipid transfer protein and a recruitment of MORN proteins to the membrane by other proteins. Have the authors performed any additional experiments in this direction? Also, the potential role of palmitoylation is only mentioned in the discussion (page 22), while palmitoylation would provide a simple means for membrane recruitment.

      Minor Points:

      Figure 1B: The authors should provide information on the void volume of the column.

      Page 17, line 696-701: The authors point out that the C2 crystal form is stabilized by two disulfide bridges. The authors should comment on the physiological relevance of these disulfide bridges.

      Page 18, line 734-740: The authors should provide data on the potential role of Zn2+ on MORN function in a physiological context. The section describing that the dimer is stabilized by Zn2+ ions (pages 18 and 19) lacks a discussion if Zn2+ are functionally relevant. There is only a beautiful sequence analysis and a discussion of the conservation of the Zn2+ coordinating residues. Can the authors perform Zn2+ titrations and SEC-MALS experiments (or alternatives such as SAXS) to show that Zn2+ indeed affects the oligomeric state of only the PfMORN, but not the other MORN proteins that form alternative dimers?

      Significance

      A putative role of MORN proteins in membrane and lipid binding is addressed. The view that MORN proteins bind directly to membranes is challenged. Structures of dimeric MORN proteins provide important insight into the modes of dimerization.

      There is a recent structure of MORN proteins (which is referenced by the authors), but I feel that additional structural work is important and justified. The work on membrane vs. lipid binding is important, but not sufficiently addressed in the current manuscript.

      I would recommend separating the study into two parts. The audience is likely to be confused (or bored) by the lengthy discussion of whether or not MORN proteins bind lipids and/or membranes.

      I am not an expert in structural biology, but have a fair understanding of structural biology. I have worked on lipid binding proteins and have a very good understanding of lipid/membrane-binding assays.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      R-We would like to thank the reviewers for their constructive feedback. We respond to all the reviewers' points below. We have highlighted the major changes made in response to both reviewers' comments in the attached revised version of the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The work described in this manuscript by Tan and Marques aims to address whether the splicing of enhancer-associated long noncoding RNAs (elncRNAs) has a direct impact on enhancer activity or just reflects the high activity of their cognate enhancers.

      For this purpose, the authors started by integrating RNA-seq data for human lymphoblastoid cell lines with ENCODE enhancer annotations and ChIP-seq data for enhancer function-associated chromatin modifications to show that multi-exonic elncRNAs are more transcriptionally active than single-exonic elncRNAs and eRNAs. They then show that regions flanking elncRNA splice sites are enriched in splicing-associated sequence elements and that these are under stronger purifying selection (suggesting some functional relevance), both when compared to promoter-associated lncRNAs. They also show the concomitance of cis-disrupted splicing in elncRNAs and drops in expression in their target genes. Finally, they use causal inference analysis of joint seQTLs and joint scQTLs to investigate the causal relationship between splicing of elncRNAs and expression of putative gene targets and chromatin states at elncRNA cognate enhancers, respectively. They conclude that, in both cases, most associations are causally mediated by splicing of elncRNAs and therefore that this contributes to their enhancer activity.

      This manuscript is generally well written, targets an original question and potentially sets the seeds for a new exciting line of research on transcriptional regulation, by providing some evidence for the functional relevance of the splicing of elncRNAs. However, the overlooking of some important aspects of the regulation of RNA splicing led to biases in the design of data analyses and in the interpretation of the biological implications of some results that need to be dealt with before the described work can be considered sound enough for publication.

      R:We would like to thank the reviewer for taking the time to assess our manuscript and for the constructive comments.

      **Essential revisions:**

      1.Results, page 10, lines 21-24: The statement that "the impact of SS variants on gene splicing efficiency depends on the total number of alternative transcripts and exons" is not properly substantiated. The four examples given in Figure S3 do not illustrate any dependence or trend. If such dependence is "expected" the underlying concept must be explained, i.e. why the impact of SS variants on splicing efficiency should depend on the number of alternative transcripts and exons. The reader is unrealistically expected to be used to the chosen splicing efficiency metric to intuit its dependence on the number of exons. Moreover, it is not obvious where the dependence on the number of alternative transcripts comes from, particularly given that alternative splicing (e.g. the skipping of a neighbouring exon, if internal) is not profiled.

      R-In the previous version of the manuscript, we estimated gene-level changes in splicing, which include all alternative splicing events within a gene. Therefore, the more exons an elncRNA has, the more “diluted” we expect the overall impact of an SS variant on elncRNA splicing efficiency to be. After considering the reviewer’s comment, we realized that only splicing events directly impacted by the SS variant should be considered in this analysis.

      In the revised version of the manuscript, we considered only alternative splicing events that include the splice donor or acceptor site changed by the SS variant, and are therefore a direct consequence of the SS variant. As suggested by the reviewer, for these splicing events, we report the fold difference in Percentage-Spliced-In (PSI) (estimated by Leafcutter (Li et al. 2018)) between samples that carry reference and alternative alleles at these SS variants. To further illustrate these changes, we now include a diagram, for each SS variant, with differential splicing information and the fold difference in PSI for each affected splicing event (Figure 3B,C, Supplementary Figure S3). In addition, the overall change across all affected splicing events is also plotted in Figure 3D and Supplementary Figure S4.

      We have modified this section to account for this and the next comment from the reviewer.

      “To estimate the impact of SS variants on splicing efficiency, we calculated the Percentage-Spliced-In (PSI) (Li et al. 2018) per individual and for each elncRNA splicing event involving the splice donor or acceptor site disrupted by the SS variants (Figure 3B,C, Supplementary Figure S3). PSI measures exon inclusion and considers spliced reads spanning exon junctions (Li et al. 2018). We compared the average difference in PSI, as a proxy for change in splicing efficiency, of all affected splicing events between individuals that carry the reference and alternative canonical splice donor/acceptor sites (GT-AG). Alongside decreased exon inclusion, SS variants can also promote exon skipping events (Figure 3B,C, Figure S3). Despite some increase in exon skipping, SS variants are associated with an overall decrease in splicing efficiency (Figure 3D and Supplementary Figure S4).” (Page 10).
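      The comparison described in the quoted passage can be sketched as follows. This is a minimal illustration only, assuming that PSI for a junction is taken as its share of the spliced reads in the cluster of junctions that use the affected splice site (in the spirit of LeafCutter's intron usage ratios). The function names, read counts and the "ref"/"alt" labels are hypothetical and are not taken from the manuscript.

```python
from statistics import mean

def psi(junction_reads, cluster_reads):
    """Share of a cluster's spliced reads that support one junction."""
    total = sum(cluster_reads)
    return junction_reads / total if total else float("nan")

def mean_psi_difference(samples):
    """Mean PSI in alternative-allele carriers minus mean PSI in reference carriers.

    `samples` is a list of (genotype, junction_reads, cluster_reads) tuples,
    where genotype is "ref" or "alt" (hypothetical labels).
    """
    by_genotype = {"ref": [], "alt": []}
    for genotype, junction_reads, cluster_reads in samples:
        by_genotype[genotype].append(psi(junction_reads, cluster_reads))
    return mean(by_genotype["alt"]) - mean(by_genotype["ref"])

# Toy counts: reads for the canonical junction, then reads for every junction
# in the cluster (canonical junction first, exon-skipping junction second).
toy_samples = [
    ("ref", 90, [90, 10]),
    ("ref", 80, [80, 20]),
    ("alt", 40, [40, 60]),
    ("alt", 50, [50, 50]),
]
delta_psi = mean_psi_difference(toy_samples)
```

      In this toy case the exon-skipping junction gains reads in carriers of the alternative allele, yet the PSI of the canonical junction still drops, which is the pattern the response describes.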

      2.Along these same lines, and more importantly, why haven't the authors looked at the possibility that a variant disrupting a splice site would lead to skipping of the neighbouring exon (if internal)? Given how the spliceosome operates (in terms of intron and exon recognition), wouldn't this be the most likely scenario? When calculating the splicing index, are reads spanning junctions between non-consecutive exons considered? Otherwise, not profiling alternative isoforms generated by exon skipping will necessarily bias splicing efficiency quantifications by overlooking fully efficient splicing associated with such isoforms. Similarly, how did the authors make sure that splicing changes did not bias elncRNA expression estimates? How was the effective transcript length determined for the calculation of RPKMs? The authors need to make these methodological clarifications, as well as why exon skipping was not considered as a splicing disruption with potential functional implications. Calculating the percent spliced-in (PSI) for all internal exons would be much more informative.

      R-Regarding the methodology, what we refer to as splicing efficiency is Percentage-Spliced-In (PSI). We calculated PSI for all, including alternative, splicing events. We now make this clearer throughout the manuscript and in the figure axis/legends.

      As detailed in the methods section, to minimize the impact of alternative splicing on gene expression estimates, we quantified expression at the gene, and not at the transcript, level using HTSeq across all annotated exons. This approach allows us to assess elncRNA and target gene expression while masking differences in alternative transcript abundance, which are not relevant in the context of this analysis.

      As suggested by the reviewer, instead of considering PSI of all possible splicing events of the gene, in the revised version of the manuscript, we considered only splicing events that are directly impacted by the SS variant. This change does not impact our conclusions, but certainly provides a better understanding of how SS variants impact splicing and we would like to thank the reviewer for raising this point. As predicted by the reviewer and as expected given how the spliceosome operates, exon skipping is a frequent outcome of SS variants. However, the increase in exon skipping is not sufficient to compensate for the decrease in the inclusion of these exons, which is directly impacted by the SS variants. This is demonstrated by the lower overall splicing efficiency for each elncRNA in individuals that carry SS variants that disrupt canonical splice donor/acceptor sites (Figure 3D and Supplementary Figure S4).

      3.All results in panels 3B-F are presented as fold differences. It is actually not clear what those differences refer to. For instance, the grey boxes are the distributions of the fold differences in splicing index / expression between individuals carrying reference alleles and what?

      R-The boxplots represent the distribution of the fold difference in PSI or expression for each individual relative to the median PSI or expression in individuals with the reference genotype. As expected, the distribution of log fold difference in either PSI or expression for individuals carrying the reference allele is centered at 0.

      We have clarified this in the methods section and figure legends.
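      As a small sketch of what each boxplot summarises, the snippet below expresses every individual's value relative to the median of the reference-genotype group. The exact formula is not given in the response, so the definition used here, (value - reference median) / reference median, is an assumption, as are the function name, the toy PSI values and the genotype labels.

```python
from statistics import median

def fold_difference_vs_reference(values, genotypes, reference="ref"):
    """Per-individual difference from the reference-group median, scaled by that median."""
    ref_values = [v for v, g in zip(values, genotypes) if g == reference]
    ref_median = median(ref_values)
    return [(v - ref_median) / ref_median for v in values]

# Hypothetical PSI values for five individuals.
psi_values = [0.80, 0.90, 0.85, 0.40, 0.50]
genotypes = ["ref", "ref", "ref", "alt", "alt"]
fold_diff = fold_difference_vs_reference(psi_values, genotypes)

# Split the result back into the two groups that the boxplots would show.
ref_fold_diff = [f for f, g in zip(fold_diff, genotypes) if g == "ref"]
alt_fold_diff = [f for f, g in zip(fold_diff, genotypes) if g == "alt"]
```

      Under this definition the reference group is centered at 0 by construction, so any shift in the other group reads directly as a change relative to reference carriers.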

      4.It is to be expected that most joint seQTLs result from variants directly impacting splicing in cis. As the quantification of splicing is noisier than that of expression, a stronger effect is required for the detection of an sQTL than an eQTL. In other words, joint seQTLs are essentially sQTLs. This is illustrated by the example in Figure 4A, with the SNP in an intronic region of the elncRNA being associated with strong differences in splicing and tiny (

      R-We agree with the reviewer that the quantification of splicing is noisier than that of expression. However, and in contrast with the reviewer’s hypothesis, higher “noise” in splicing quantification compared to expression led to weaker associations between splicing and seQTLs, as illustrated in the figure below. This is in line with splicing being measured with a higher error rate, which would ultimately lead to smaller detectable sQTL effects than there would be with perfect measurements. This also demonstrates that since, a priori, eQTL associations are stronger, if a bias exists in the causal inference analysis, it should favour detection of non-causal associations. Therefore, our approach is not biased in detecting causal seQTLs.

      We agree with the reviewer that this potential bias may be a concern to readers and should be addressed. We have added this analysis to the text (Supplementary Figure S7E) and explained why the causal inference testing approach is not biased in detecting causal seQTLs.

      “To assess whether this approach was biased towards the detection of causal seQTLs we compared the slope and adjusted p-value of the associations between the variant and splicing or expression for all causal seQTLs. As illustrated in Supplementary Figure S7E this analysis revealed there is no evidence that stronger sQTLs would favour causal model predictions.” (Page 16).
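      The general point that a noisier measurement weakens an observed association can be illustrated with a toy simulation. This is not the authors' analysis: the additive genotype model, the effect size, the noise levels and all names below are assumptions chosen only for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def simulate_association(noise_sd, n=2000, effect=0.5, seed=1):
    """Correlation between genotype (0, 1 or 2 alleles) and a noisy phenotype."""
    rng = random.Random(seed)
    genotypes = [rng.choice([0, 1, 2]) for _ in range(n)]
    phenotype = [effect * g + rng.gauss(0, noise_sd) for g in genotypes]
    return pearson_r(genotypes, phenotype)

# Same true effect, measured with low versus high measurement noise.
r_low_noise = simulate_association(noise_sd=0.5)
r_high_noise = simulate_association(noise_sd=2.0)
```

      With the true effect held fixed, the correlation obtained from the noisier measurement is the weaker of the two, which matches the direction of the argument made in the response.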

      5.It is not totally clear what message the authors intend to convey with the result of panel 4D. Are they talking about the relative position of the variant to the elncRNA transcript or the target transcript? If the former, shouldn't the known synergy between transcription and 5´ end splicing reflect on elncRNA expression? If the latter, it is not obvious how the result connects to the mentioned synergy.

      R:In Figure 4D, we show the relative position, within a transcript, of the exonic splicing junction which is associated with causal seQTLs. The enrichment in associations with splicing junctions located at 5’ end of elncRNAs is consistent with the synergy between 5’ end splicing and transcription. We clarify this in the text:

      “Importantly, 90% of seQTL associations that support elncRNA splicing as a mediator of target expression are associated with splicing junctions located at the 5´ end of the transcript, which is consistent with the known synergy between transcription and 5´ end splicing (Furger et al. 2002; Damgaard et al. 2008)(Figure 4D).” (Page 17).

      **Proofreading edits:**

      R:We would like to thank the reviewer for identifying all the typos listed below. We have corrected them in the revised version of the manuscript.

      6.Introduction, page 3, line 13: double "in".

      7.Figure S2A, leftmost panel X-axis label: "intrno" instead of "intron".

      8.Results, page 10, line 30: remove "of".

      9.It is 5´ and 3´(prime) not 5' and 3' (apostrophe).

      **Other suggestions:**

      10.Violin plots (with included boxplots) would more comprehensively convey the differences in distributions than the chosen notched boxplots.

      R-We thank the reviewer for this suggestion. Although we appreciate the added information a violin plot can provide, this also renders, in our opinion, their interpretation less intuitive. Because boxplots are simpler and easier to interpret, after consideration, we decided to continue using these to represent the distribution of the data.

      Reviewer #1 (Significance (Required)):

      It is hard for me to assess the significance of this work (beyond some evidence for the potential functional relevance of the splicing of elncRNAs) until the aforementioned concerns are addressed but it is of potential interest to the broad RNA research community.

      I am a computational biologist with experience in the analysis of high-throughput transcriptomic data and a focus on transcriptional and alternative splicing regulation.

      ========================================================================

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript addresses an interesting question - whether the splicing of transcriptional enhancer-associated RNAs influences their transcriptional enhancement activity. The analyses appear carefully done, using appropriate datasets and statistical methods. The authors find, for example, that marks of active chromatin are enriched near spliced elncRNAs, that splicing-related motifs of elncRNAs are under selective constraint, and that splicing of elncRNAs is associated with higher elncRNA expression and to very slightly higher expression of target genes.

      R:We thank the reviewer for the constructive feedback on our manuscript. We have extended our analysis to address the reviewer's concerns, which we detail in our response to the comments below.

      However, I did not find the main results convincing of the main conclusion for the following reasons:

      1.The most direct evidence is shown in Figure 3, where SNPs that occur in 3' splice sites of elncRNA introns are explored, and it is shown that variants predicted to disrupt splicing of elncRNA introns are associated with reduced expression of target but not non-target genes. But the fold difference in expression of target genes is extremely small - a few percent - and is actually less than the fold difference in expression of the elncRNAs themselves (which appears closer to 10%), raising the question of whether elncRNA expression rather than splicing may be more important for activity. Furthermore, the entire analysis has an anecdotal quality, being based on only 4 splice-disrupting elncRNA variants. I did not find the figure at all convincing of the conclusion the authors draw from it.

      R:We agree with the reviewer that our analysis is limited by the available genotyping data that is restricted to common genetic variants. Our evolutionary constraint analysis (Figure 2) indicates that variants that disrupt elncRNA splicing are depleted by natural selection and so we expected to identify a relatively small number of elncRNAs (n=4) suitable for this analysis. Despite the anticipated challenges in identifying elncRNA splice site mutations, we nevertheless believe this unbiased natural mutational analysis is analogous to experimentally disrupting splice sites of these 4 elncRNA candidates.

      Regarding the strength of impact of splicing on target expression: in the absence of a comparable experiment, we could not anticipate the magnitude of the effect. We acknowledge that previous studies, which sought to completely remove splicing by either deleting all elncRNA introns (Yin et al. 2015) or terminating transcription after its 1st exon (Engreitz et al. 2016), were both associated with significantly stronger impact on elncRNA splicing and target expression than what we report here. The analysis we present here involves single nucleotide polymorphisms and so it is not surprising to have resulted in more moderate impact on overall splicing. Furthermore, whether the differences in the impact on target expression between this and previous analyses are the result of a stronger effect of complete removal of splicing or a consequence of the genetic changes introduced remains unclear. The small yet consistent decrease in target expression we observed, even with minimal changes in splicing of an unbiased set of 4 candidates, is in our opinion strong evidence that modulation in elncRNA splicing is sufficient to impact, albeit moderately, target expression.

      Importantly, we replicated the impact of decreased splicing on target expression of the 4 elncRNA candidates using 89 samples from the Yoruba (YRI) population in the Geuvadis dataset (Supplementary Figure S5). The robustness of the mutational study consistently supports the physiologically relevant effect of elncRNA splicing on cognate enhancer function.

      As pointed out by the reviewer, elncRNA SS variants led to stronger impact on the expression of the elncRNAs compared to that of their targets (Figure 3F,H and Supplementary Figure S4), which suggests that target expression regulation is likely a consequence of changes in elncRNA expression as a result of changes in its splicing. This is described as our working model in the discussion section of the manuscript.

      2.Figure 4 uses a causal inference approach and involves larger datasets. While causal inference can be a useful tool to identify candidate causal relationships, it does not prove causality, which still requires some sort of experimental perturbation. Thus, I found these results suggestive but still not satisfying to justify, e.g., the title of the paper or claims made in the abstract. As in Figure 3, the specific example shown in Fig. 4A again shows a relatively tiny effect on target gene expression, which again appears to be a few percent at most.

      R:For the reasons explained above, we had no expectation that the effect size of the association between elncRNA splicing and target expression would be high. It is nevertheless key that these associations are robust, which would provide reliable support for our hypothesis. To assess this, we used 2 independent datasets to replicate elncRNA target associations with sQTL variants associated with elncRNA splicing: 1) 147 LCL samples from GTEx and 2) 31,684 blood samples from eQTLgen. Using these datasets, we replicated the association between sQTL and target expression for targets of up to 77% of elncRNAs. As expected, replicated associations have significantly higher effect size in both datasets (Supplementary Figure S9) and 1.2 times more associations can be replicated in the eQTLgen blood samples with a larger cohort size. LCL-specific effect of elncRNA splicing likely explains why not all associations are replicated in these blood samples. We report these analyses in the manuscript (Supplementary Figure S9).

      We agree with the reviewer that the causal inference analysis is only suggestive per se. However, we would argue that conclusions of the present manuscript do not rely on this analysis alone, but instead on the combined evidence of several experiments, including the natural mutational analysis that is analogous to the experiment the reviewer proposes.

      Considering the reviewer's concern, we realized that the previous version of Figure 4A did not reflect the average strength of the association between seQTL variant and target expression (median=0.319, ranging from 0.16 to 0.81, Rebuttal Figure 1). For this reason, we replaced the previous illustration by a more representative example (Figure 4A).

      The text illustrating reproducibility of our results in GTEx and eQTLgen has been added to Page 17 of the manuscript.

      “We used two independent datasets to assess the robustness of elncRNA target association with sQTL variants we predict to be associated with the splicing of these elncRNAs in LCLs. Using a smaller cohort of LCLs (n=147 (GTEx Consortium 2013)), we found a significant association in the same direction between sQTL and target expression for targets of 70% of elncRNAs (45% of variants). A larger fraction of associations (77% of elncRNAs and 52% of variants) could be replicated in a larger cohort of blood samples (n=31,684 (Võsa et al. 2018)). The difference in size between these two cohorts is likely to explain the difference in replication rate. The association between elncRNA splicing variants and target expression that were replicated have significantly higher effect size relative to non-replicated associations (Supplementary Figure S9). Furthermore, LCL-specific effect also likely explains why not all associations can be replicated in the large blood cohort.” (Page 17).

      3.Figure 2 shows that splicing-related signals are under selective constraint in spliced elncRNAs, which is convincing. But this does not prove that splicing of elncRNAs is directly related to enhancer activity. It is equally plausible that elncRNA expression directly impacts enhancer activity and that elncRNA splicing is conserved because it boosts elncRNA expression, for example.

      R:The reviewer is right and the sentence “If splicing of elncRNAs is important for enhancer function, …” does not faithfully describe the conclusions that can be drawn from the analysis reported in Figure 2. This portion of the text now reads: “If splicing of elncRNAs is functionally relevant, one would expect selection to have prevented the accumulation of deleterious mutations in their splicing-associated motifs during evolution” (Page 8). We would like to thank the reviewer for pointing this out.

      Other points:

      4.Are the ChIP profiles in Figures 1A-E significantly different from each other in a statistical sense? Probably yes, but a specific test should be done.

      R:We have now added boxplots representing the distribution of read density centered at transcript promoters. The statistical difference between the distributions is also tested. We show this in the revised Figure 1A-E and Supplementary Figure S1B-C.

      5.This sentence (p. 10) was hard to follow and should be clarified: "As expected, the impact of SS variants on gene splicing efficiency depends on the total number of alternative transcripts and exons and ranges from 11% to 24% for elncRNA with 6 to 2 number of exons, respectively (Supplementary Figure S3)."

      R:We had previously estimated the average amount of change in splicing for all alternative splicing events at each elncRNA candidate. To calculate this, we considered the difference in Percentage-Spliced-In (PSI) for all splicing events and divided this by the total number of considered events. Given that only a subset of events is affected by a splice site variant, the more exons an elncRNA has, the more alternative splicing events are likely to occur and the lower the average impact of a SS variant on overall gene splicing efficiency is expected to be. Following a comment from reviewer 1 (comment 1), we now only consider splicing events directly disrupted by the SS variant. We agree this sentence was not clear and have removed it from the manuscript.
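      The "dilution" described in this response can be made concrete with a small arithmetic sketch. The PSI changes below are invented for illustration and the helper name is hypothetical. The only point shown is that averaging over all events shrinks the apparent effect as the number of unaffected events grows, whereas averaging over affected events does not.

```python
def mean_change(events, affected_only=False):
    """Average PSI change over events; each event is (delta_psi, is_affected)."""
    deltas = [d for d, affected in events if affected or not affected_only]
    return sum(deltas) / len(deltas)

# Hypothetical two-exon elncRNA: a single splicing event, hit by the SS variant.
two_exon_events = [(-0.30, True)]
# Hypothetical six-exon elncRNA: the same affected event plus four unaffected ones.
six_exon_events = [(-0.30, True), (0.0, False), (0.0, False), (0.0, False), (0.0, False)]

gene_level_two = mean_change(two_exon_events)
gene_level_six = mean_change(six_exon_events)
affected_only_six = mean_change(six_exon_events, affected_only=True)
```

      Restricting the average to affected events, as in the revised analysis, makes the two hypothetical loci directly comparable.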

      6.Related to point 5 above, Supplementary Figure 3 is somewhat confusing because two splicing change and three expression change plots are shown for each locus, without labels of what each one is, or explanation of what the red and green colors mean.

      R:We apologize for the confusion and thank the reviewer for pointing this out. In the figure, we plot the fold difference in elncRNA splicing, target gene splicing, target gene expression, non-target gene expression, and elncRNA expression. elncRNA features are plotted in red and target gene features are plotted in green. We have added labels to clarify the relevant plots (Figure 3D-H, Supplementary Figure S4,5).

      **Minor points:**

      1.Top of p. 16: "90% of those that support elncRNA splicing as a mediator of target expression are located at the 5' end of the transcript, which is consistent with the known synergy between transcription and 5' end splicing" - a reference is needed

      R:We thank the reviewer for pointing this out and we have now added the appropriate reference.

      “Importantly, 90% of seQTL associations that support elncRNA splicing as a mediator of target expression are associated with splicing junctions located at the 5´ end of the transcript, which is consistent with the known synergy between transcription and 5´ end splicing (Furger et al. 2002; Damgaard et al. 2008)(Figure 4D).” (Page 17)

      2.Figure 5B,C y-axes indicate "fold difference", but scales include negative numbers, which is confusing. Probably should redo the analysis showing log of fold difference.

      R:We thank the reviewer for the suggestion. Since the fold difference in Percentage-Spliced-In (PSI) used to estimate the amount of splicing at each exon junction can be of both positive and negative values, we now plot the log modulus transformation (John and Draper, 1980) of the data, which is equivalent to a log transformation while preserving the sign of the data. The analysis has been redone for Figure 3D-H, 5B,C, and Supplementary Figure S4, S5. This change does not impact the conclusions and makes the interpretation of the results more intuitive.
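      For reference, the log modulus transformation maps x to sign(x) * log(|x| + 1). A minimal sketch is given below. The natural logarithm is assumed here because the response does not state the base, and the input values are hypothetical.

```python
import math

def log_modulus(x):
    """sign(x) * log(|x| + 1); defined for negative, zero and positive inputs."""
    return math.copysign(math.log1p(abs(x)), x)

# Hypothetical fold differences, some negative and some positive.
fold_differences = [-0.53, -0.06, 0.0, 0.06, 0.53]
transformed = [log_modulus(x) for x in fold_differences]
```

      The transformation is odd and monotonic, so it keeps both the sign and the ordering of the fold differences while compressing large magnitudes.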

      3.p. 20 Describes U1 snRNP as "a protein essential for the recognition of nascent RNA 5' splice site and assembly of the spliceosome". U1 is a large RNA-protein complex, not a protein.

      R:We thank the reviewer for pointing this out and this has now been corrected.

      “Chromatin-bound lncRNAs have been recently shown to be enriched in U1 small nuclear ribonucleoprotein (snRNP) RNA-protein complex, a protein essential for the recognition of nascent RNA 5´ splice site and assembly of the spliceosome (Yin et al. 2020).” (Page 22)

      4.Typo: p. 11, l. 2 missing word (genes): "expression levels of other nearby was unaffected"

      R:This has been corrected.

      Reviewer #2 (Significance (Required)):

      The question addressed is very interesting, given recent work on the significance of transcription from enhancers, and work addressing functional relationships between splicing and expression. The work is suggestive of effects of enhancer splicing on expression but I did not find it fully convincing as the effects observed are extremely small, and other explanations are not ruled out, as discussed above.

      Prior literature has shown that many active enhancers are transcribed, that enhancer transcription can precede and is positively correlated with target gene expression, and work from both Ulitsky and from the authors indicates that splicing of enhancer-associated lncRNAs is positively correlated with enhancer activity. A variety of studies have also shown that splicing of protein-coding genes generally has a strong positive effect on gene expression. Here, the authors attempt to go further and show that splicing of enhancers causes increased transcriptional enhancement of target genes. A variety of public genotype, expression, chromatin and other types of data are analyzed to address this question. The statistical genetics crowd may find the work of interest, but molecular biologists will not be convinced of the conclusions. My expertise is in computational biology, genomics and RNA biology.

      Engreitz JM, Haines JE, Perez EM, Munson G, Chen J, Kane M, McDonel PE, Guttman M, Lander ES. 2016. Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature 539: 452-455.

      John J, Draper N. 1980. An Alternative Family of Transformations. Journal of the Royal Statistical Society. Series C (Applied Statistics) 29: 190-197. doi:10.2307/2986305

      Li YI, Knowles DA, Humphrey J, Barbeira AN, Dickinson SP, Im HK, Pritchard JK. 2018. Annotation-free quantification of RNA splicing using LeafCutter. Nature genetics 50: 151-158.

      Yin Y, Yan P, Lu J, Song G, Zhu Y, Li Z, Zhao Y, Shen B, Huang X, Zhu H et al. 2015. Opposing Roles for the lncRNA Haunt and Its Genomic Locus in Regulating HOXA Gene Activation during Embryonic Stem Cell Differentiation. Cell stem cell 16: 504-516.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript addresses an interesting question - whether the splicing of transcriptional enhancer-associated RNAs influences their transcriptional enhancement activity. The analyses appear carefully done, using appropriate datasets and statistical methods. The authors find, for example, that marks of active chromatin are enriched near spliced elncRNAs, that splicing-related motifs of elncRNAs are under selective constraint, and that splicing of elncRNAs is associated with higher elncRNA expression and to very slightly higher expression of target genes.

      However, I did not find the main results convincing of the main conclusion for the following reasons:

      1.The most direct evidence is shown in Figure 3, where SNPs that occur in 3' splice sites of elncRNA introns are explored, and it is shown that variants predicted to disrupt splicing of elncRNA introns are associated with reduced expression of target but not non-target genes. But the fold difference in expression of target genes is extremely small - a few percent - and is actually less than the fold difference in expression of the elncRNAs themselves (which appears closer to 10%), raising the question of whether elncRNA expression rather than splicing may be more important for activity. Furthermore, the entire analysis has an anecdotal quality, being based on only 4 splice-disrupting elncRNA variants. I did not find the figure at all convincing of the conclusion the authors draw from it.

      2.Figure 4 uses a causal inference approach and involves larger datasets. While causal inference can be a useful tool to identify candidate causal relationships, it does not prove causality, which still requires some sort of experimental perturbation. Thus, I found these results suggestive but still not satisfying to justify, e.g., the title of the paper or claims made in the abstract. As in Figure 3, the specific example shown in Fig. 4A again shows a relatively tiny effect on target gene expression, which again appears to be a few percent at most.

      3.Figure 2 shows that splicing-related signals are under selective constraint in spliced elncRNAs, which is convincing. But this does not prove that splicing of elncRNAs is directly related to enhancer activity. It is equally plausible that elncRNA expression directly impacts enhancer activity and that elncRNA splicing is conserved because it boosts elncRNA expression, for example.

      Other points:

      4.Are the ChIP profiles in Figures 1A-E significantly different from each other in a statistical sense? Probably yes, but a specific test should be done.

      5.This sentence (p. 10) was hard to follow and should be clarified: "As expected, the impact of SS variants on gene splicing efficiency depends on the total number of alternative transcripts and exons and ranges from 11% to 24% for elncRNA with 6 to 2 number of exons, respectively (Supplementary Figure S3)."

      6.Related to point 5 above, Supplementary Figure 3 is somewhat confusing because two splicing change and three expression change plots are shown for each locus, without labels of what each one is, or explanation of what the red and green colors mean.

      Minor points:

      1.Top of p. 16: "90% of those that support elncRNA splicing as a mediator of target expression are located at the 5' end of the transcript, which is consistent with the known synergy between transcription and 5' end splicing" - a reference is needed

      2.Figure 5B,C y-axes indicate "fold difference", but scales include negative numbers, which is confusing. Probably should redo the analysis showing log of fold difference.

      3.p. 20 Describes U1 snRNP as "a protein essential for the recognition of nascent RNA 5' splice site and assembly of the spliceosome". U1 is a large RNA-protein complex, not a protein.

      4.Typo: p. 11, l. 2 missing word (genes): "expression levels of other nearby was unaffected"

      Significance

      The question addressed is very interesting, given recent work on the significance of transcription from enhancers, and work addressing functional relationships between splicing and expression. The work is suggestive of effects of enhancer splicing on expression but I did not find it fully convincing as the effects observed are extremely small, and other explanations are not ruled out, as discussed above.

      Prior literature has shown that many active enhancers are transcribed, that enhancer transcription can precede and is positively correlated with target gene expression, and work from both Ulitsky and from the authors indicates that splicing of enhancer-associated lncRNAs is positively correlated with enhancer activity. A variety of studies have also shown that splicing of protein-coding genes generally has a strong positive effect on gene expression. Here, the authors attempt to go further and show that splicing of enhancers causes increased transcriptional enhancement of target genes. A variety of public genotype, expression, chromatin and other types of data are analyzed to address this question. The statistical genetics crowd may find the work of interest, but molecular biologists will not be convinced of the conclusions. My expertise is in computational biology, genomics and RNA biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The work described in this manuscript by Tan and Marques aims to address whether the splicing of enhancer-associated long noncoding RNAs (elncRNAs) has a direct impact on enhancer activity or just reflects their cognate enhancer's high activity.

      For this purpose, the authors started by integrating RNA-seq data for human lymphoblastoid cell lines with ENCODE enhancer annotations and ChIP-seq data for enhancer function-associated chromatin modifications to show that multi-exonic elncRNAs are more transcriptionally active than single-exonic elncRNAs and eRNAs. They then show that regions flanking elncRNA splice sites are enriched in splicing-associated sequence elements and that these are under stronger purifying selection (suggesting some functional relevance), both when compared to promoter-associated lncRNAs. They also show the concomitance of cis-disrupted splicing in elncRNAs and drops in expression in their target genes. Finally, they use causal inference analysis of joint seQTLs and joint scQTLs to investigate the causal relationship between splicing of elncRNAs and expression of putative gene targets and chromatin states at elncRNA cognate enhancers, respectively. They conclude that, in both cases, most associations are causally mediated by splicing of elncRNAs and therefore that this contributes to their enhancer activity.

      This manuscript is generally well written, targets an original question and potentially sets the seeds for a new exciting line of research on transcriptional regulation, by providing some evidence for the functional relevance of the splicing of elncRNAs. However, the overlooking of some important aspects of the regulation of RNA splicing led to biases in the design of data analyses and in the interpretation of the biological implications of some results that need to be dealt with before the described work can be considered sound enough for publication.

      Essential revisions:

      1.Results, page 10, lines 21-24: The statement that "the impact of SS variants on gene splicing efficiency depends on the total number of alternative transcripts and exons" is not properly substantiated. The four examples given in Figure S3 do not illustrate any dependence or trend. If such dependence is "expected" the underlying concept must be explained, i.e. why the impact of SS variants on splicing efficiency should depend on the number of alternative transcripts and exons. The reader is unrealistically expected to be used to the chosen splicing efficiency metric to intuit its dependence on the number of exons. Moreover, it is not obvious where the dependence on the number of alternative transcripts comes from, particularly given that alternative splicing (e.g. the skipping of a neighbouring exon, if internal) is not profiled.

      2.Along these same lines, and more importantly, why haven't the authors looked at the possibility that a variant disrupting a splice site would lead to skipping of the neighbouring exon (if internal)? Given how the spliceosome operates (in terms of intron and exon recognition), wouldn't this be the most likely scenario? When calculating the splicing index, are reads spanning junctions between non-consecutive exons considered? Otherwise, not profiling alternative isoforms generated by exon skipping will necessarily bias splicing efficiency quantifications by overlooking fully efficient splicing associated with such isoforms. Similarly, how did the authors make sure that splicing changes did not bias elncRNA expression estimates? How was the effective transcript length determined for the calculation of RPKMs? The authors need to make these methodological clarifications, and to explain why exon skipping was not considered as a splicing disruption with potential functional implications. Calculating the percent spliced-in (PSI) for all internal exons would be much more informative.

      3.All results in panels 3B-F are presented as fold differences. It is actually not clear what those differences refer to. For instance, the grey boxes are the distributions of the fold differences in splicing index / expression between individuals carrying reference alleles and what?

      4.It is to be expected that most joint seQTLs result from variants directly impacting splicing in cis. As the quantification of splicing is noisier than that of expression, a stronger effect is required for the detection of an sQTL than an eQTL. In other words, joint seQTLs are essentially sQTLs. This is illustrated by the example in Figure 4A, with the SNP in an intronic region of the elncRNA being associated with strong differences in splicing and tiny (<1%) and barely significant differences in expression. Moreover, current knowledge and reported evidence strongly suggest that cis regulation of splicing is essentially "local", i.e. directly involves the processed sequences and not the interference of neighbouring RNAs. Similarly, to my knowledge there is no evidence suggesting a trend for genes encoding splicing factors being associated with the same eQTL variants as those of their target RNAs. I would therefore predict that most joint seQTLs result from variants within the elncRNA loci directly impacting their splicing. If this is the case, causal inference analysis will naturally be biased towards more strongly linking the variants with elncRNA splicing and thereby suggesting its causal role. The same rationale applies to scQTLs. The authors need to control for that potential bias in their analyses or explain why there is no bias.

      5.It is not totally clear what message the authors intend to convey with the result of panel 4D. Are they talking about the relative position of the variant to the elncRNA transcript or the target transcript? If the former, shouldn't the known synergy between transcription and 5´ end splicing reflect on elncRNA expression? If the latter, it is not obvious how the result connects to the mentioned synergy.
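
      The percent spliced-in (PSI) metric suggested in point 2 above can be illustrated with a short sketch. This uses one common junction-read definition (mean of the two inclusion junctions against the skipping junction); it is an assumption for illustration, not the splicing index used by the authors, and the function name and read counts are hypothetical.

```python
def percent_spliced_in(upstream_junction, downstream_junction, skipping_junction):
    """PSI (0-100) for an internal exon from junction-spanning read counts.

    upstream_junction / downstream_junction: reads joining the exon to its
    neighbours (inclusion). skipping_junction: reads joining the two
    neighbouring exons directly (exon skipped).
    Returns None when no junction reads are available.
    """
    inclusion = (upstream_junction + downstream_junction) / 2
    total = inclusion + skipping_junction
    if total == 0:
        return None
    return 100.0 * inclusion / total


# Hypothetical counts: 30 reads on each inclusion junction, 10 skipping reads.
print(percent_spliced_in(30, 30, 10))
```

      Because reads between non-consecutive exons enter the denominator, an isoform produced by exon skipping is counted rather than overlooked.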

      Proofreading edits:

      6.Introduction, page 3, line 13: double "in".

      7.Figure S2A, leftmost panel X-axis label: "intrno" instead of "intron".

      8.Results, page 10, line 30: remove "of".

      9.It is 5´ and 3´(prime) not 5' and 3' (apostrophe).

      Other suggestions:

      10.Violin plots (with included boxplots) would more comprehensively convey the differences in distributions than the chosen notched boxplots.

      Significance

      It is hard for me to assess the significance of this work (beyond some evidence for the potential functional relevance of the splicing of elncRNAs) until the aforementioned concerns are addressed but it is of potential interest to the broad RNA research community.

      I am a computational biologist with experience in the analysis of high-throughput transcriptomic data and a focus on transcriptional and alternative splicing regulation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to reviewer comment for manuscript RC-2020-00207

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Major Comments:**

      The authors of the paper start the paper with just one protein narrowed down, i.e. HRG. The rest of the paper uses affinity-based proteomics, antibody validation, GWAS and survival analysis to validate this target and support their claim that HRG is an age-associated protein linked to mortality and certain clinical outcomes. How did the authors conclude that HRG was the only target to explore further in this paper? What methods or analyses were used for this? What were the other proteins, if any, that showed up in these studies?

      We appreciate this comment, which reveals that our explanation of how the protein was chosen for further analysis was unclear. The protein profile obtained using HPA045005 was the top and single hit out of 7258 protein profiles using a threshold of adjusted P-value below 0.01. In other words, only the profile of HRG was statistically significantly associated with age in the screening sample set (N = 156). The results of all protein profiles were attached as Supporting Table 1. Phrases about the alpha level were added to the text to make the threshold clear. Because antibody validation in these exploratory studies requires enormous effort and time, we could not choose a more liberal and inclusive threshold.

      For mortality outcome, it is not clear which class of disease is most strongly associated with increased risk of mortality from elevated HRG levels. If cause-specific mortality exists among the cohorts, could authors provide a more exact breakdown of the type of associated mortality by a disease class?

      We thank the reviewer for the question and have now added cause-specific data to the manuscript. Using cause-of-death data, mortality risk from diseases of the circulatory system was compared with the risk from neoplasms and other causes. Elevated HPA045005-HRG profiles were found to be associated with mortality risk from diseases of the circulatory system (HR = 1.46 per SD, P = 2.80 × 10⁻⁴, ICD-10 codes I00-I99). This was larger than the risk from malignant neoplasms (HR = 1.28 per SD, P = 1.73 × 10⁻², ICD-10 codes C00-C97). We chose the broad ICD-10 categories "I" and "C" because the number of events was too small to give enough power in the survival analysis.

      Page 4 Section 3 (Results)-

      The authors say "We found consistent age-associated trends with HPA045005 across all eight replication sets (Supporting Figure 3)". On examining the supporting figure we noticed that the slope for the set with the largest number of subjects (Set 3 with ~3000 people) is visually negligibly positive (showing the weakest age-associated trend with HPA045005). Some comments from the authors on why they think the largest data set showed the weakest association would be helpful.

      The plots for the cohorts (in Supporting Figure 3) had different y-axis ranges. To make the plots comparable, the y-axis ranges of the panels were set to be the same for all cohorts. In the new version of the plot, it is easier to see that there is in fact an increasing trend of the profiles in set 3. As we briefly discussed in the Discussion, the weaker age association in this sample set may be because the set was close to a random sample of the population in the age range. Set 1, by contrast, over-represented older people because equal numbers of people were selected in every age interval.

      From Figure 2 C in the main manuscript one concludes that for HPA045005, binding for CC individuals is ~ 2 times higher than TT individuals. Is it possible the age association showing up for HPA045005 is primarily a function of changing/increase in allele frequency as a function of age?

      The authors could consider adding a clarifying plot of Age vs Allele frequency or adding an interaction term of Age and Allele Frequency in the regression and survival analysis to address this question.

      As suggested, we have now added a test of age association in which average age was compared by genotype. The result was added to Supporting Table 3. The heterozygote (CT) group has a slightly higher average age, without statistical significance (ANOVA P = 0.096).

      It is interesting that the signals were significant with the HPA045005 antibody but not with the BSI0137 antibody. This is in spite of the fact that the GWAS for BSI0137 signals had an even stronger hit to the same locus. Can the authors please comment on why the signals from HPA045005 and BSI0137 were not highly correlated with one another and why the better antibody could not replicate the survival analysis results?

      We thank the reviewer for the comments. We believe that our text about our findings was not clear enough, although this is a primary finding. We modified the main text to make it easy to distinguish the HPA045005-derived profiles, which were influenced by the 204th amino acid of the HRG protein, from the BSI0137-derived profiles, which were influenced by the 493rd amino acid. The signals from those two antibodies were likely obtained by capturing different parts of HRG, which are schematically illustrated in Figure 2D. What we found is that only one binder's profiles, not the other's, had predictive power for mortality risk within about 8.5 years. That suggests that some age-dependent changes around the 204th residue of HRG, rather than the whole-protein level, reflected biological aging. To make our finding clearer, the two binders were compared in Table 2.

      **Minor Comments:**

      Figure 1: The authors' description of the figure could use more clarification. "For each sample set, the estimated effect from the linear regression model.." estimated effect of what on what? On reading the main text one concludes it is the effect of age on HPA045005. This needs to be clarified in the label.

      We agree with the reviewer and have added these words.

      Figure 3: The X axis for the Kaplan-Meier survival curve is labelled as Age. Survival is usually time to event and time is usually the follow-up time. Further clarification for the choice of this label might be helpful.

      We clarified the choice of the time scale in the figure legend with a reference in which it is discussed further (Thiébaut & Bénichou, 2004). We chose age as the time scale because age is the strongest risk factor for all-cause mortality, as suggested in the reference. We previously attempted to use follow-up time as the time scale with age adjustment, which gave almost the same results but violated the proportionality assumption of the Cox models.

      Figure 3: it would be good to include a table with the number of individuals at risk at the bottom of the plot at defined time intervals. The figure currently compares the bottom and top quartiles of HRG for visual assessment of mortality risk; it would also be informative to include the middle quantiles.

      The figure was updated accordingly. The risk table was included and the results of the middle group were presented.

      Supporting Table 5: The note at the bottom of this table states "standardized HRG values by linear regression and scaling." What does standardization by linear regression mean?

      A sentence that explains the standardization was added in the footnote of the table.

      Supporting Table 5: It would be useful to understand that HRG carries additional risk beyond known Age and known clinical biomarkers listed in Table 2 (APOA1, APOB, TC, TG, Glucose, LDL). Could authors include a multivariate CoxPH regression with just Age? and with Age + clinical covariates?

      The impact of those clinical variables on the survival models was examined and the results were added to Supporting Table 6 (which was Table S5). It turned out that the addition of those variables barely changed the results of the model for the HRG profile affected by the 204th amino acid.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The manuscript by Hong et al. describes the identification and validation of histidine-rich glycoprotein (HRG) as a marker of chronological age and all-cause mortality. HRG was determined using proteomics of serum and plasma samples in 9 different cohorts (total sample size ~4,100). The association with mortality was tested in the largest available cohort (TwinGene), comprising ~3,000 samples. The association with mortality seems to be stronger in women in comparison to men and could not be explained by CRP or diabetes-related traits. The HRG levels determined using an alternative antibody, BSI0137, did not show any association with mortality, indicating that the effect on mortality is likely isoform-dependent. The performed analyses seem to be statistically solid. However, the association with mortality still needs to be replicated in independent studies and the HRG measurement does not yet seem to be ready for standardized high-throughput measurement, which is necessary to make it usable as biomarker.

      **Major comments**

      • Although the authors have convincingly identified HRG to be associated with chronological age and mortality, it will require quite some additional work (including replication of the observed association with mortality in independent cohorts, testing the predictive ability, and making the measurement standardized and high-throughput) to prove its use as potential biomarker. At the moment, this is not at all discussed in the manuscript. Moreover, there have been some recent large-scale studies that identified biomarkers at the metabolic level that are not at all mentioned by the authors. The authors only refer once to the recent proteomic study by Lehallier in the Introduction, but do not at all discuss their findings in relation to this paper. Last but not least, HRG has already been associated with mortality in a previous study (https://www.ncbi.nlm.nih.gov/pubmed/29303798), but there is no mention of this anywhere in the manuscript. Hence, I think it would be good if the authors perform a thorough literature search to place their findings into context and rewrite their Discussion accordingly.

      We appreciate the reviewer's comments on the limitations of our paper. We are aware that further investigation of the HPA045005-HRG profiles in independent cohorts is required to confirm them as a biomarker. Instead, we supported our findings with a set of confirmatory analyses; we validated and annotated the age-associated profile by applying GWAS, sandwich assays, peptide arrays and mass spectrometry. By comparing two antibody profiles, we narrowed the signal down to an age-associated region within the protein HRG. The approach and finding, we believe, are novel.

      We added some discussion about recent large-scale proteomic studies such as Tanaka et al, 2018 and Lehallier et al, 2019. Unexpectedly, HRG was not measured in those studies, despite the protein being one of the abundant proteins in blood (Poon et al, 2011). This may reflect challenges in assay development and a missing piece in those large studies. The papers lack further investigation of molecular targets, which is common in proteomic papers and makes it difficult to compare between studies and technologies. In that sense, our approach is different from other proteomic studies, because we invested time and effort to investigate the molecular target.

      We are, however, thankful for the pointer to the suggested HRG publication, which we did not know about. We concluded that there are substantial differences in the subjects and in the suggested functions of the protein. Kuroda et al. found HRG to be a biomarker for sepsis in ICU patients, while our study was done on the general population. They measured the HRG protein level, whereas we found one particular region in HRG to be a biomarker for all-cause mortality. Hence, we briefly discussed the reference in the paragraph giving general information about HRG.

      • The authors need to add a Supplementary Table showing the association of all their 7,258 HPA antibodies with chronological age. Although I trust the authors, I can currently not tell if it is indeed correct that only one antibody was significantly associated with age in set 1.

      We agree with the reviewer. The table of association test results for all 7258 antibody profiles was attached to the paper as Supporting Table 1. We were also surprised that only one passed a conventional P-value threshold of 0.01 after Bonferroni correction. This might be due to the low number of samples in sample set 1 (N=156) compared to the number of antibodies or tests.
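
      To make the stringency of this threshold concrete, the following is a minimal sketch of a Bonferroni correction for 7258 tests at an alpha of 0.01, the two numbers stated in the response. The function names and the example P-values are hypothetical, and the sketch is not the authors' analysis code.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Multiply each raw P-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Numbers from the response: 7258 antibody profiles, alpha of 0.01.
cutoff = bonferroni_threshold(0.01, 7258)
print(f"raw P-value must be below {cutoff:.3e}")

# Hypothetical raw P-values for three tests, for illustration only.
print(bonferroni_adjust([0.001, 0.5, 0.02]))
```

      A raw P-value therefore has to fall below roughly 1.4 × 10⁻⁶ to pass, which is a demanding bar for a screening set of 156 samples.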

      • According to description in the Supporting Information, several samples in set 3-5 were overlapping with set 1 (45 in total). These samples should be removed from datasets 3-5 to make sure that there are no overlapping samples in the meta-analysis. However, I am not sure if the authors have actually done this. For the GWAS the overlapping samples from set 3 could still be included, given that set 1 is not involved in that. The authors could actually use these 45 overlapping samples to provide additional details about the reproducibility of HPA045005 between different measurements, for example by showing a correlation plot.

      We agree with the reviewer. Those 45 overlapping samples were excluded from the meta-analysis. As the reviewer suggested, only the data of sample set 3 were used for the GWAS.

      We also appreciate the comment regarding reproducibility and acknowledge that there are limitations to the technical performance of our exploratory SBA method. The procedure is tailored to handle a large number of antibodies and to profile 384 samples in the analysis plates. This setup allowed us to process a relatively large number of samples per batch, but it might be affected by batch effects. In our study set 3, there were 2999 samples randomized and analyzed in 8 different 384-well plates. The 44 overlapping samples between sets 1 and 3 were added to one of these 8 plates. This resulted in 1-11 samples being analyzed on the same plate; hence, comparing these 44 with previous assays might be influenced, if not dominated, by plate effects. We went back to the initial data set generated during 2011/2012 and compared the first data with replicated assays using the same freeze-thawed samples. For HPA045005 we found the data to correlate by r=0.45. The next analyses of these 44 samples were conducted during 2015 using different sample aliquots and preparations as well as different SBAs. The correlation to previous assays was r

      • When looking at the effect of the rs9898-stratified analysis (Table S2) it seems that there only is an effect in the presence of the C-allele. Have the authors considered the presence of a potential recessive effect of this variant when looking at mortality?

      Average age of the individuals of each genotype of the SNP was compared and added into Supporting Table 3 (which was Table S2). No significant difference between the genotypes was found. As the reviewer noted, the mortality association of the HRG profiles affected by 204th amino-acid in the TT genotype group of rs9898 was milder and did not reach statistical significance. We believe that it is due to substantially smaller sample size and number of deaths in the genetic group. To clarify the difference in numbers, those numbers were added into the Supporting Table 3 (which was Table S2).

      • The authors need to discuss in more detail the implications of the difference between the two HRG antibodies in their association with mortality, for example in light of the use of HRG levels as a potential biomarker (i.e. how should one deal with the fact the way the levels are measured influences the outcome).

      We appreciated this valuable comment, which clearly reveals that our claim was not explained sufficiently. We modified the main text to distinguish those two antibody profiles more clearly. We also added Figure 2D and changed the structure of Table 2 to highlight the difference between the two antibody profiles.

      • Why did the authors put part of their Discussion in the Supplement? This is not common practice. They should either move it to the manuscript or remove it completely.

      We moved the discussion in the supplement to the main text, as the reviewer suggested.

      Reviewer #2 (Significance (Required)):

      The manuscript is clearly written and the analyses seem to be solid. However, although the findings described in the manuscript are interesting for the ageing field, they only provide a small step in the process of the usability of HRG as biomarker, i.e. many validation and follow-up studies will be necessary to prove its value. There have been some recent biomarker studies that have been much more advanced in this respect, which limits the novelty of this manuscript. I therefore feel that this manuscript may be best suitable for a medium-impact ageing-specific journal. My fields of expertise are ageing, genetics, and molecular epidemiology. Given my limited expertise when it comes to proteomics, I was not able to provide detailed comments on the methodology concerning this part.

      We thank the reviewer for the honest and constructive assessment of our work and agree with the suggestion to transfer this work to a medium-impact journal covering aspects of ageing research.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Hong et al. describes the identification and validation of histidine-rich glycoprotein (HRG) as a marker of chronological age and all-cause mortality. HRG was determined using proteomics of serum and plasma samples in 9 different cohorts (total sample size ~4,100). The association with mortality was tested in the largest available cohort (TwinGene), comprising ~3,000 samples. The association with mortality seems to be stronger in women in comparison to men and could not be explained by CRP or diabetes-related traits. The HRG levels determined using an alternative antibody, BSI0137, did not show any association with mortality, indicating that the effect on mortality is likely isoform-dependent. The performed analyses seem to be statistically solid. However, the association with mortality still needs to be replicated in independent studies and the HRG measurement does not yet seem to be ready for standardized high-throughput measurement, which is necessary to make it usable as biomarker.

      Major comments

      • Although the authors have convincingly identified HRG to be associated with chronological age and mortality, it will require quite some additional work (including replication of the observed association with mortality in independent cohorts, testing the predictive ability, and making the measurement standardized and high-throughput) to prove its use as potential biomarker. At the moment, this is not at all discussed in the manuscript. Moreover, there have been some recent large-scale studies that identified biomarkers at the metabolic level that are not at all mentioned by the authors. The authors only refer once to the recent proteomic study by Lehallier in the Introduction, but do not at all discuss their findings in relation to this paper. Last but not least, HRG has already been associated with mortality in a previous study (https://www.ncbi.nlm.nih.gov/pubmed/29303798), but there is no mention of this anywhere in the manuscript. Hence, I think it would be good if the authors perform a thorough literature search to place their findings into context and rewrite their Discussion accordingly.

      • The authors need to add a Supplementary Table showing the association of all their 7,258 HPA antibodies with chronological age. Although I trust the authors, I can currently not tell if it is indeed correct that only one antibody was significantly associated with age in set 1.

      • According to description in the Supporting Information, several samples in set 3-5 were overlapping with set 1 (45 in total). These samples should be removed from datasets 3-5 to make sure that there are no overlapping samples in the meta-analysis. However, I am not sure if the authors have actually done this. For the GWAS the overlapping samples from set 3 could still be included, given that set 1 is not involved in that. The authors could actually use these 45 overlapping samples to provide additional details about the reproducibility of HPA045005 between different measurements, for example by showing a correlation plot.

      Minor comments

      • When looking at the effect of the rs9898-stratified analysis (Table S2) it seems that there only is an effect in the presence of the C-allele. Have the authors considered the presence of a potential recessive effect of this variant when looking at mortality?

      • The authors need to discuss in more detail the implications of the difference between the two HRG antibodies in their association with mortality, for example in light of the use of HRG levels as a potential biomarker (i.e. how should one deal with the fact the way the levels are measured influences the outcome).

      • Why did the authors put part of their Discussion in the Supplement? This is not common practice. They should either move it to the manuscript or remove it completely.

      Significance

      The manuscript is clearly written and the analyses seem to be solid. However, although the findings described in the manuscript are interesting for the ageing field, they only provide a small step in the process of the usability of HRG as biomarker, i.e. many validation and follow-up studies will be necessary to prove its value. There have been some recent biomarker studies that have been much more advanced in this respect, which limits the novelty of this manuscript. I therefore feel that this manuscript may be best suitable for a medium-impact ageing-specific journal.

      My fields of expertise are ageing, genetics, and molecular epidemiology. Given my limited expertise when it comes to proteomics, I was not able to provide detailed comments on the methodology concerning this part.

      Joris Deelen

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The paper applied affinity-based proteomics and antibody validation to choose and validate histidine-rich glycoprotein (HRG) as a protein/target of interest. Survival analysis techniques were used to show associations between this protein and certain biomarkers, age and all-cause mortality. These results and findings were used to conclude that HRG may serve as a molecular indicator of age and mortality risk.

      Major Comments:

      The authors of the paper start the paper with just one protein narrowed down, i.e. HRG. The rest of the paper uses affinity-based proteomics, antibody validation, GWAS and survival analysis to validate this target and support their claim that HRG is an age-associated protein linked to mortality and certain clinical outcomes. How did the authors conclude that HRG was the only target to explore further in this paper? What methods or analyses were used for this? What were the other proteins, if any, that showed up in these studies?

      For mortality outcome, it is not clear which class of disease is most strongly associated with increased risk of mortality from elevated HRG levels. If cause-specific mortality exists among the cohorts, could authors provide a more exact breakdown of the type of associated mortality by a disease class?

      Page 4 Section 3 (Results)-

      The authors say "We found consistent age-associated trends with HPA045005 across all eight replication sets (Supporting Figure 3)". On examining the supporting figure we noticed that the slope for the set with the largest number of subjects (Set 3 with ~3000 people) is visually negligibly positive (showing the weakest age-associated trend with HPA045005). Some comments from the authors on why they think the largest data set showed the weakest association would be helpful.

      From Figure 2 C in the main manuscript one concludes that for HPA045005, binding for CC individuals is ~ 2 times higher than TT individuals. Is it possible the age association showing up for HPA045005 is primarily a function of changing/increase in allele frequency as a function of age? The authors could consider adding a clarifying plot of Age vs Allele frequency or adding an interaction term of Age and Allele Frequency in the regression and survival analysis to address this question.

      It is interesting that the signals were significant with the HPA045005 antibody but not with the BSI0137 antibody. This is despite the fact that the GWAS for the BSI0137 signals had an even stronger hit at the same locus. Can the authors please comment on why the signals from HPA045005 and BSI0137 were not highly correlated with one another, and why the better antibody could not replicate the survival analysis results?

      Minor Comments:

      Figure 1: The authors' description of the figure needs clarification. "For each sample set, the estimated effect from the linear regression model.." Estimated effect of what on what? On reading the main text, one concludes that it is the effect of age on HPA045005. This needs to be clarified in the label.

      Figure 3: The X axis for the Kaplan-Meier survival curve is labelled as Age. Survival is usually expressed as time to event, where time is usually the follow-up time. Further clarification of the choice of this label would be helpful.

      Figure 3: It would be good to include a table with the number of individuals at risk at the bottom of the plot at defined time intervals. The figure currently compares the bottom and top quartiles of HRG for visual assessment of mortality risk; it would also be informative to include the middle quartiles.
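
      To make the request for a number-at-risk table concrete, here is a minimal sketch of how such a table can be computed per group. It assumes follow-up time as the time scale (with age as the time scale, entry ages would also be needed); the function name, the group labels and the toy data are hypothetical.

```python
def numbers_at_risk(follow_up, groups, time_points):
    """Count, per group, the individuals still at risk at each time point.

    follow_up: observed time (event or censoring) for each individual.
    groups: group label (e.g. HRG quartile) for each individual.
    An individual is at risk at time t if the observed time is >= t.
    """
    table = {}
    for t_obs, group in zip(follow_up, groups):
        row = table.setdefault(group, [0] * len(time_points))
        for i, t in enumerate(time_points):
            if t_obs >= t:
                row[i] += 1
    return table


# Toy data (invented): four individuals in each of two quartile groups.
follow_up = [2, 5, 8, 10, 1, 4, 6, 10]
groups = ["Q1"] * 4 + ["Q4"] * 4
at_risk = numbers_at_risk(follow_up, groups, time_points=[0, 5, 10])
print(at_risk)  # {'Q1': [4, 3, 1], 'Q4': [4, 2, 1]}
```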

      Supporting Table 5: The note at the bottom of this table states "standardized HRG values by linear regression and scaling." What does standardization by linear regression mean?

      Supporting Table 5: It would be useful to understand whether HRG carries additional risk beyond Age and the known clinical biomarkers listed in Table 2 (APOA1, APOB, TC, TG, Glucose, LDL). Could the authors include multivariate CoxPH regressions adjusted for Age alone and for Age + clinical covariates?
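
      For clarity, the two models requested above could be written as follows. This is only a sketch of what we have in mind; the symbols (baseline hazard $h_0$, coefficients $\beta$ and $\gamma_k$, covariates $Z_k$) are our own notation and not the authors'.

```latex
% Model 1: HRG adjusted for Age only
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_{\mathrm{HRG}}\,\mathrm{HRG} + \beta_{\mathrm{Age}}\,\mathrm{Age}\right)

% Model 2: HRG adjusted for Age and the clinical covariates
% Z_k \in \{\mathrm{APOA1}, \mathrm{APOB}, \mathrm{TC}, \mathrm{TG}, \mathrm{Glucose}, \mathrm{LDL}\}
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_{\mathrm{HRG}}\,\mathrm{HRG} + \beta_{\mathrm{Age}}\,\mathrm{Age} + \sum_{k} \gamma_k Z_k\right)
```

      The quantity of interest is whether the hazard ratio per unit of HRG, $\exp(\beta_{\mathrm{HRG}})$, remains different from 1 in both models.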

      Significance

      The authors have identified a new biomarker for aging and mortality. Understanding the mechanisms and pathways involved in HRG homeostasis, and how aging causes dysregulation of HRG, could be a topic for further research. Overall, this pathway provides an opportunity for a new molecular target for aging-based drugs and research.

      This article should be of interest to researchers interested in the biology of aging and to researchers developing drugs to slow down the process of aging. In addition, it should be of interest to researchers studying HRG as a biomarker, for example in sepsis (https://ccforum.biomedcentral.com/articles/10.1186/s13054-018-2127-5, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3437790).

      This paper was reviewed by 3 co-reviewers: a senior principal investigator with extensive bioinformatics, metabolomics/proteomics and epidemiological experience; a highly experienced computational biologist with a record of developing and applying methods in bioinformatics and computational biophysics; and a computational biologist with a background in applied mathematics and statistical analysis. All three scientists are interested in aging research and in understanding how human physiology, and biomarkers in particular, change as a function of age.