  1. May 2021
    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Lysosomes play key roles in cellular homeostasis by functioning as a signaling hub for growth control and acting as a terminal catabolic station. Deregulation of lysosomes is now linked to multiple human diseases, including cancer and neurodegeneration. An emerging topic of interest in lysosomal biology is the regulation of lysosomal proteostasis and how it impacts the overall fitness and functionality of the lysosome itself. Zhang et al. present a case study of quality control of lysosomal membrane proteins, with a focus on the turnover of the lysosome-anchored E3 ubiquitin ligase RNF152. They show that RNF152 is rapidly degraded in an ESCRT-dependent fashion and that this mechanism is also conserved in yeast.

      Major comments:

      1. The writing of the manuscript including the abstract could be further polished. The manuscript in its present form appears to be a technical report that does not sufficiently convey the significance of this study.
      2. Cycloheximide is commonly used to study the half-lives of proteins of interest. It is not a new method developed by the authors.
      3. Protein turnover was presented by plotting the relative protein level as a function of time, but degradation kinetics were described inconsistently throughout the manuscript, which is scientifically inappropriate. The authors should first fit the data to first-order decay to obtain a degradation rate constant, k (min⁻¹), and calculate the half-life (T1/2) from it.
      4. What are the functional consequences of RNF152 degradation? What are the biological impacts at both lysosomal and cellular levels in RNF152-depleted cells?
      5. Given the rapid turnover of RNF152 at basal state, one can predict that this protein may become functionally important under specific circumstances, for example, certain stress. This aspect is worth exploring.
      6. The authors chose RNF152 over OCA2, a melanosome-specific protein. However, OCA2 was shown to colocalize with LAMP2 much better than RNF152.
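
      As a minimal sketch of the fit requested in major comment 3, assuming first-order decay N(t) = N0 * exp(-k * t), a least-squares line through ln(level) versus time gives k, and the half-life follows as T1/2 = ln 2 / k. The time points and levels below are synthetic placeholders, not data from the manuscript, and the function name is hypothetical.

```python
import math

def fit_first_order_decay(times_min, levels):
    """Fit levels = N0 * exp(-k * t) by least squares on ln(levels).

    Returns (k, half_life): k in min^-1, half_life in minutes.
    """
    logs = [math.log(v) for v in levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay constant is the negative slope of ln(level) vs. t
    return k, math.log(2) / k

# Synthetic chase data generated with k = 0.02 min^-1 (illustrative only).
times = [0, 30, 60, 120, 240]
levels = [math.exp(-0.02 * t) for t in times]
k, t_half = fit_first_order_decay(times, levels)
```

      Reporting k and T1/2 from such a fit, rather than reading kinetics off individual time points, makes different constructs and conditions directly comparable.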

      Minor comments:

      1. Mislabeling and typographical errors in the text:
         a. Page 7 "As expected, the full-length GFP-RNF152 and other lysosomal proteins such as LAMP2 and cathepsin D (CTSD) were enriched by Lyso-IP. In contrast, PDI (ER), Golgin160 (Golgi), EEA1 (endosomes), and GAPDH (cytosol) were not enriched (Figure 2D)." - should be Figure 2E instead.
         b. Page 7 "Our result confirmed that the lysosome population of GFP-RNF152 is quickly turned over, while LAMP2 is very stable on the lysosome (Figure 2E)." - should be Figure 2F instead.
         c. Page 14 "knocking down either TSG101 or both TSG101 and RNF152 only had a minor impact on the degradation kinetics of GFP-RNF152 (Figure S3A-B)." - should be ALIX instead of RNF152.
      2. Stable cells expressing GFP-RNF152 or 3xFLAG-RNF152 were primarily used in this study. It will be useful to perform some experiments by examining the endogenous counterpart using antibodies against RNF152. For example, Figure 2D and 2E.
      3. For all the flow cytometry analysis, the value of GFP intensity in respective graphs should be indicated.
      4. Statistical analysis was not performed on Figure 5D.
      5. In Figure 6D and J, what are the reasons for the appearance of multiple peaks, particularly, by the red line?
      6. In Figure 3A, the question marks should be removed to avoid confusion. "Predicted" can be used instead if there is no direct evidence from mass spec analysis.
      7. In Figure 3C, the authors identified two mutants including KR and CS that are refractory to degradation. It will be more insightful by showing the ubiquitination of these two mutants as in Figure 3B.

      Significance

      Multiple mechanisms, including the ESCRT complex, have been reported to regulate the quality control of lysosomes. Understanding the role of each mechanism and the selection of their substrates in the maintenance of lysosomal integrity is of great interest in cell biology. Zhang and colleagues showed a case study of RNF152, a substrate of ESCRT-dependent degradation, but did not further pursue the biological functions of RNF152. This somewhat limits the conceptual advance of the study.

    2. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      The authors do not wish to provide a response at this time.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We would like to thank the editor for their consideration and the reviewers for their time and thoughtful comments. Below we have written a point-by-point response to their comments and concerns. The original comments are displayed in italic fonts, whereas our responses are in regular fonts for clarity.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The submitted manuscript 'Identification of phenotype-specific networks from paired gene expression-cell shape imaging data' by Barker et al. uses a collection of different bioinformatics tools (see Key Resource Table) to analyze reported RNA sequencing data and to correlate derived pathways with imaging features of breast cancer cell lines based on specific pathway constructions. The common thread of the data presentation in the manuscript is not obvious.

      Major concerns:

      1.1 The main biological 'finding' of the study, RAP1 'as a potential mediator between the sensing of mechanical stimuli and regulation of NFkB activity', has already been reported, and therefore the assumption 'how exactly extra-cellular mechanical cues are sensed by the cell and passed on to NFkB in breast cancer is not understood' is misleading. Please review: https://www.nature.com/articles/ncb2080 (human breast cancers with NF-κB hyperactivity show elevated levels of cytoplasmic Rap1. Similar to inhibiting NF-κB, knockdown of Rap1 sensitizes breast cancer cells to apoptosis), https://pubmed.ncbi.nlm.nih.gov/17510404/ (RAP1 is a crucial element in organizing acinar structure and inducing lumen formation), and https://pubmed.ncbi.nlm.nih.gov/21429211/.

      R1.1 We thank the reviewer for pointing out these references. Teo et al. (which is cited in our manuscript) provides evidence that Rap1 regulates IKK and therefore NFkB in breast cancer, while Itoh et al. and McSherry et al. focus on Rap1’s ability to modulate migration and morphogenesis as do other similar papers cited in our manuscript. None of the papers show the significance of the Rap1-NFkB interaction in the explicit context of cell shape with Teo et al. only speculatively mentioning a potential relevance of Rap1/NFkB in migration (“Given that NF-κB is critical for [. . . ] stimulating invasion, our results document a clinical setting wherein Rap1-mediated regulation of NF-κB could be critical.”). We appreciate that the specific sentence the reviewer has drawn attention to is slightly misleading given Teo et al. and we will amend it in the revised manuscript. However, here, we use a novel methodology on a past dataset to link the concepts introduced by these 3 papers within the specific context of cellular morphology in breast cancer cells.

      Specifically, in this manuscript, we identify a Rap1 expression module correlated with cell shape and find that it is at the network confluence of transcription factors activated by cell shape. This, along with our findings of modulation of NFkB co-activators, as well as previous work showing that it is a key mechano-transductive transcription factor leads us to hypothesize that Rap1 mediates the regulation and mechano-sensing of cell shape via its interaction with NFkB.

      It is also important to note that, while we build on the findings of Teo et al., McSherry et al. and Itoh et al. relating NFkB, Rap1 and cell shape, we also use our method to focus on other proteins of interest. We draw attention to these in the discussion, the ARNT KO/TNFalpha module being the gene expression module most highly correlated with the morphological features. The importance of transcriptional co-regulators of NFkB is also illustrated in the network propagation, with both NR0B2 and PPARGC1A mentioned in the discussion. However, our analysis naturally concentrates on the node with the most apparent literature support, which as your reading suggests is Rap1. The significance of this manuscript is that it is an unbiased systems-based methodology used to link cell shape with signaling, via transcription, in a context-specific manner (i.e. in the context of breast cancer). This produces a phenotype-specific network that has allowed us to connect diverse mechanisms and hypotheses put forward by other authors and further our understanding of how signaling manages the sensing and regulation of cell shape in breast cancer. The methodology is also applicable to any paired transcriptomics/phenotype dataset.

      1.2 Besides, Fig. 2 and 3 are unrelated to this main statement.

      R1.2 Figure 2 shows the results of our morphological cluster identification and subsequent differential expression analysis. Since these were included as parts of the network, we included them to give the reader an idea of the components included by this step. Figure 3A shows the network that was generated by our pipeline which forms the basis of all subsequent biological exploration, and the discovery of Rap1 and nodes important for the regulation of cell shape. Figure 3B shows that no bias was introduced by using the specific algorithm for our network generation. As such Figures 2 and 3 are related to the generation of the cell shape-specific network that forms the basis of our study. We will amend the text and figure legends to clarify this point more carefully, and we will consider moving some of the panels to the supplementary materials to simplify the message.

      1.3. The spotted RAP1 (by TFs JARD2 and RUNX2) finding is not obvious without Fig. 4 results, a network propagation of functional TFs in differentially activated processes (basal vs. luminal) in the cell shape regulatory network. Please show that RAP1 could not be identified without the network, based on TF and DEG only.

      R1.3 The Rap1 hypothesis is supported by both Figure S2E and Figure 4. Figure S2E shows that the Rap1 pathway-enriched gene expression module is the most differentially expressed module among those incorporated in our cell shape regulatory network. This suggests that this module is correlated to cell shape on a transcriptomic level, but does not necessarily mean anything within the context of intracellular signaling. Figure 4 shows that this gene expression module is at the confluence of activated transcription factors as specified by our constructed signaling network. This is an interesting finding as it implies (unlike Figure S2E) that the Rap1 gene expression module is relevant to intra-cellular signaling.

      While the Rap1 module is indeed differentially expressed and could in theory have been found just by the DE analysis as being important, the network approach enables us to integrate these modules of co-expressed genes within known signaling networks. This allows us to go further than just making comments about expression and transcription factor activity, to discussing how signaling networks interact with our identified gene expression modules. This in turn allows us to construct more sophisticated hypotheses about cell shape regulation.

      Particularly, we use this analysis to reinforce the association with Rap1 by illustrating that the Rap1 network node lies at the confluence of transcription factors activated in luminal-like and basal-like cell shapes. We also use the network to identify highly central nodes (such as PPARGC1A, CTNNB1 and ESR1) and other proteins identified in the network propagation (YAP1, IKBKB and ARNT). Furthermore, the network is used as a means of integrating gene expression modules in their signaling network environment. The method by which this embedding was done (in-going edges being transcription factors regulating the module and out-going edges being signaling proteins contained within the module) adds context specificity to a network that is otherwise generalised to many cell-types and contexts.
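
      The embedding rule described above (in-going edges from transcription factors regulating a module, out-going edges to signaling proteins contained within it) can be sketched as follows. This is only an illustration under stated assumptions: the network is a plain dictionary mapping each node to its set of successors, the function name is hypothetical, and all node names (TF_A, KINASE_X, module_1, and so on) are placeholders rather than nodes from the manuscript's network.

```python
def embed_module(network, module, regulating_tfs, module_genes, signaling_proteins):
    """Embed a gene expression module as a node in a directed signaling network.

    network: dict mapping node -> set of successor nodes (modified in place).
    In-going edges:  each regulating TF -> module.
    Out-going edges: module -> each module gene that is also a signaling protein.
    """
    network.setdefault(module, set())
    for tf in regulating_tfs:
        network.setdefault(tf, set()).add(module)
    for gene in module_genes:
        if gene in signaling_proteins:
            network[module].add(gene)
            network.setdefault(gene, set())
    return network

# Placeholder signaling network: KINASE_X signals to TF_A.
network = {"KINASE_X": {"TF_A"}, "TF_A": set(), "TF_B": set()}
embed_module(
    network,
    module="module_1",
    regulating_tfs=["TF_A", "TF_B"],
    module_genes=["GENE_1", "KINASE_X", "GENE_2"],
    signaling_proteins={"KINASE_X", "TF_A", "TF_B"},
)
```

      Module genes that are not signaling proteins (GENE_1 and GENE_2 here) stay inside the module node and add no edges, which is what ties the otherwise generic signaling network to the expression context.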

      The lack of clarity on how we arrived at Rap1 as a key tenet of our discussion, as well as the added value to the methodology of the network analysis is something that we will certainly work on in our revision and we thank the reviewer for their valuable feedback. We will also move Figure S2E from the supplementary figures to the main Figure panels, as it is an important part of how we arrived upon Rap1 as a module of particular interest.

      1.4 More complex fluorescence phenotypes are available, and only 10 simple cell shape features do not match the complexity of the RNASeq data, data input and pathway construction. Conversely, relatively 'monoclonal' breast cancer cell lines may be the only application for this workflow.

      R1.4 We thank the reviewer for their comment and respectfully disagree. These cell shape features were sufficient for the original authors Sero et al. to predict TF activities (PMID: 26148352) and Sailem et al. to identify clinically predictive metagenes (PMID: 27864353). Although these features seem simplistic, they concisely summarise a highly complex phenotype and are proven to encode metastatic potential (PMCID: PMC6976289) and to be prognostic markers for breast cancer progression (PMID: 28977854). Accordingly, these features are re-analysed using our workflow to better understand the signaling that drives them. Understanding other features that might match the complexity of our expression data is possible using the presented method, but is outside of the scope of the research within this manuscript as our research question focuses on the regulation of cell shape in breast cancer.

      As evidence that this cell shape network is applicable beyond cell lines, we can perform an analysis on breast cancer patient data from the TCGA, demonstrating the relationship of our network’s components with metastasis, which is highly related to cell shape.

      1.5 Image features Fig. 1 and 5 do not match

      R1.5 It is true that the features in Fig. 1 and Fig. 5 do not exactly match. This is because these two datasets came from different studies. While they are slightly different features, the essential phenotypes that they quantify are the same. For example, in Fig. 5, cytoplasm area, cytoplasm perimeter, nucleus area, nucleus length, nucleus width and nucleus perimeter are clearly basic morphological features that are analogous to features in Fig. 1, such as cell area, cell width to length, nucleus area and nucleus width to length. It is a solid assumption that the biological processes that drive these features are common, and our results illustrate this. Meanwhile, the features measured in the Fig. 5 dataset that are not analogous to those used for model construction provide convenient negative controls for determining that our network describes regulatory processes specific to the features used in network construction.

      1.6 with Fig. 5 being a rather indirect 'proof' of usability.

      R1.6 The purpose of Figure 5 is not to demonstrate usability but to prove that the network that we have identified is indeed representing cell signalling that controls cell shape. It shows that perturbations inside our network have a significantly stronger effect on phenotypes used in the creation of the network than perturbations outside our network. Moreover, it shows that this is not the case for phenotype features not used in the creation of our network, underlining that our network is both accurate and specific to the phenotypes used as input. We will add further clarifications in the text and figure legend to explain this better.
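
      The comparison described in R1.6 can be illustrated with a one-sided permutation test on the difference in mean effect between in-network and out-of-network perturbations. This is a sketch under assumptions: the response does not state which statistical test was used, so the permutation test is a stand-in, the function name is hypothetical, and the effect sizes below are made-up numbers rather than LINCS measurements.

```python
import random

def permutation_p_value(inside, outside, n_perm=10000, seed=0):
    """One-sided permutation test for mean(inside) > mean(outside).

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    n_in = len(inside)
    observed = sum(inside) / n_in - sum(outside) / len(outside)
    pooled = list(inside) + list(outside)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the inside/outside labels
        diff = (sum(pooled[:n_in]) / n_in
                - sum(pooled[n_in:]) / (len(pooled) - n_in))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical absolute changes in one shape feature after kinase inhibition.
inside_network = [0.9, 1.1, 1.3, 0.8, 1.2]
outside_network = [0.2, 0.3, 0.1, 0.4, 0.25]
observed, p_value = permutation_p_value(inside_network, outside_network)
```

      Running the same test on a feature that was not used to build the network serves as the negative control: there, no significant inside-versus-outside difference would be expected.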

      1.7 Fig. 1a is not yet visually self-explanatory and asks a lot of the reader.

      R1.7 We apologize for the lack of clarity here and will revise the figure to better present the method and reduce confusion.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors combine single cell morphology and gene expression data to identify signaling activities implicated in the control of cellular morphogenesis. They describe a reasonable bioinformatics pipeline from gene expression shifts between two morphological phenotypes to pathways, then to common transcription factors to signaling. As far as I can assess the situation (I am not familiar with all the tools they use) the proposed pipeline works convincingly.

      We would like to thank the reviewer for their thoughtful comments and to clarify that the analysis has not been done on single cell gene expression data, but rather on bulk RNAseq data. This is further explained in the point-by-point responses below.

      2.1 However, I am concerned that the logic underlying this analysis is only partially valid. The link between signaling and morphology may be more direct than via TF-based gene expression regulation. Many signals (and many of the kinases the authors test for validation) are implicated in morphology control as direct upstream regulators of cytoskeleton dynamics and adhesion. This also applies to the GTPase Rap1, which the authors fish out as the most differentially expressed signal between two types of morphologies. In addition to the indirect effect of Rap1 on morphology via NFKB regulation suggested by the authors, Rap1 will affect morphology probably very directly through activation of Rac -> F-actin and RIAM -> nascent adhesions. At minimum, the authors should discuss this complexity as a caveat of their approach. And dependent on the impact the authors hope to have with this story, I believe they should experimentally resolve the ambiguity of direct vs indirect signaling for some of their key interpretations.

      R2.1 We thank the reviewer for making this fair point about the limitations of our data, i.e that we are not able to directly observe and delineate exactly how upstream signaling has modulated or is modulated by cell shape. As a first major clarification, which we will make sure to include in the text, the Rap1 node doesn’t necessarily represent the Rap1 GTPase itself. It is a co-expression module that is enriched in activators and downstream effectors of Rap1 signalling. As such when we are talking about the Rap1 module we mean the subnetwork of Rap1 signalling rather than the specific small GTPase itself. Thus, we can’t assume that Rap1 itself is the key node in this subnetwork and it is therefore complicated to design specific experiments to test the direct or indirect modulation of cell shape by Rap1. We agree however that additional information regarding the role of this module in regulating cell shape would be interesting and valuable.

      For this study, we have access to transcriptomic and cell shape data from 14 cell lines. Using the expression data, we will quantify Rap1 expression module activity and its relationship to NFkB transcriptional activity across these cell lines. By comparing in these cell lines the effect on morphology when NFkB and the Rap1 module are combinatorially activated or deactivated, we can discriminate between competing hypotheses for direct and indirect activities of these two factors on cell shape. For example, if the overwhelming source of the Rap1 module's function was via direct interaction with F-actin, then Rap1 module activity would be predictive of cell shape, regardless of NFkB activation. A caveat to this is our limited access to only 14 cell lines. Additionally, if necessary, we have access to drugs that can induce cytoskeletal defects and perturb morphology directly. This can be used to disrupt the relationship between Rap1 and F-actin that the reviewer has identified and gauge the effect on cell shape. If such an intervention disrupts the relationship between Rap1 signaling module/NFkB transcriptional activity and cell shape, then we can hypothesise that the activity of Rap1 signaling is greater than just its direct activity on F-actin. Finally, we can perform a knock-down of Rap1 (or selected components of its module) and NFkB itself and gauge the effect of such a perturbation on the Rap1 gene expression module and cell shape.

      That being said, these signaling modulations (whether indirect or direct) are reflected accordingly in differentially activated transcription factors, and therefore can be observed and recorded from expression data. This is an interesting finding, as it implies that signaling processes not explicitly making use of transcription factors (such as those that directly affect adhesion complexes, regulating cytoskeletal proteins etc) can still have their activity gauged through their indirect downstream expression signatures. In any case, our findings illustrate that there is a cell shape-specific modulation of the Rap1 module in breast cancer, reflected in the expression data. Rap1 almost certainly has some direct contributions to cytoskeletal dynamics in breast cancer (PMID: 10805781, PMID: 30156466 and PMID: 22644079), but here we observe clearly how it also is modulating transcription factors, that we hypothesise may contribute to the development of a morphological and transcriptomic ‘niche’ in a more robust and long-term fashion. Nonetheless, the points discussed by the reviewer are valuable and in our revision we will discuss this as a potential caveat.

      In defense of the presented premise, the authors start out by looking for correlation between gene expression and morphology, and they find some signal. Correlation analysis, especially in large data sets, tends to be pretty robust and specific, even in the presence of strong confounders. Thus, even though the expression-morphology correlation, which points indirectly at morphology-regulating signaling modules, is likely to be superimposed by direct morphology-regulating signaling pathways that the proposed approach will not be able to detect, the presented analysis is valuable, in principle.

      We thank the reviewer for their positive comments.

      That said, I have a number of substantial concerns also with the implementation and presentation of the approach.

      2.2 First, on the presentation side, for a paper that talks about cell morphology it is strange to have not a single figure panel showing an image of cells, or at least cell outlines. As a reader I would like to get visual impression of how different a high vs low Rap1 gene expresser is, for example.

      R2.2 - We agree that it would greatly help the clarity and message of the manuscript. As we are not able to use the public and previously published data that have been used for our paper due to copyright laws associated with journal publications, we will generate relevant images representative of the respective cell shapes and include them in the manuscript.

      2.3 Along the same lines, it is not quite clear to me when the authors collate entire cell lines into a single phenotype, do they switch then to population-based analysis? That is, for example the volcano plots in 2B,C are they representing an average gene expression shift?

      R2.3 - We apologise for the lack of clarity. The morphological clustering and differential expression illustrated in Figure 2 are intended to find expression signatures responsible for distinct breast cancer cell shapes. We link our expression data and our imaging data via the breast cancer cell lines and so are limited to studying expression in bulk per cell line. The volcano plots in 2B,C show the differentially expressed genes (as calculated by DESeq2) in morphologically distinct clusters of cell lines as specified by Figure 2A. The cell lines were collated in this way so that we could observe transcriptomic differences accounting for cellular morphology, rather than differences between cell lines (these being already well characterised and not in the scope of this manuscript).

      We will add additional clarifications in our revised manuscript to further explain this.

      2.4 How heterogeneous are the morphological signals?

      R2.4 We provide values of standard deviation for the morphological features of the derived clusters (page 4 - “Clustering based on morphology reveals distinctive cell-line shapes”). The heterogeneity of the morphological clusters is minimised as per the elbow plot shown in Figure S2A. This plot illustrates the decreasing total within-cluster variation of the cell shape groups as the number of clusters (k) is increased. The point of inflection represents the optimum number of clusters (in our case k=3). Aside from this, we note that one of those 3 clusters is significantly more heterogeneous, both morphologically and biologically (illustrated in Figure 2A), and so we used the other two, which showed more informative gene expression profiles and could be annotated roughly with breast cancer subtypes (basal and luminal, although this alignment was not perfect since the grouping was based only on the morphology). We took the more heterogeneous cluster (cluster A, Figure 2A) to be the least relevant cluster in terms of morphology and biological significance, also because it contained the non-tumorigenic cell line MCF10A.
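
      The elbow criterion mentioned in R2.4 can be sketched as below: for each candidate k, cluster the points and record the total within-cluster sum of squares (WCSS), then look for the k beyond which the decrease flattens. The sketch assumes k-means-style clustering with a deterministic farthest-point initialisation, since the response does not name the algorithm used, and the two-dimensional points are synthetic stand-ins for cell shape features.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    return centroids

def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_point_init(points, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points]
        if new_assignment == assignment:
            break  # converged: assignments no longer change
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(squared_distance(p, centroids[a])
               for p, a in zip(points, assignment))

# Synthetic data with three tight, well-separated groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
wcss = {k: kmeans_wcss(points, k) for k in range(1, 5)}
```

      On these synthetic points the WCSS falls steeply up to k=3 and changes little afterwards, which is the elbow shape used to justify a choice such as k=3.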

      2.5 Are the correlations between gene expression and morphology computed with single cell data as the basis?

      R2.5 Due to the availability only of bulk RNA sequencing data for the cell lines for which we also had high content imaging data, all analyses are done using bulk RNA sequencing data at the cell line level. We will clarify this in the text and methods section.

      2.6 Could the volcano plots be sharpened by accounting for the single cell variation in morphology instead of lumping the cells into two morphological classes?

      R2.6 As per the limitations mentioned in R2.3 and R2.5, we cannot study gene expression at the single cell level. Furthermore, the utility of using morphological clusters was so that we could observe morphological transcriptomic traits rather than those specific to cell lines.

      2.7 On the back end of the paper, when the authors apply kinase inhibitors to validate some of the claimed pathways, it would be nice for the reader to see the morphological effects of these inhibitors. It would also help to relate the kinase-induced shifts to the morphological heterogeneity that is the basis for the initial correlation analysis driving the study. At the end of the day, the proof is in the pudding.

      R2.7 - We thank the reviewer for this very good point; addressing it would certainly improve the manuscript. We will attempt to source a visual illustration of the effect of kinase inhibitors on the breast cancer cells. However, the dataset we source these data from is publicly available but unpublished, and so our use is constrained by the terms of use of LINCS. We will contact the LINCS consortium to request permission and, if they allow it, we will certainly include the images in our revised manuscript.

      2.8 Finally, cell morphology regulation is a pretty foundational process of life. One therefore wonders whether the pathways the authors pulled out of their analysis work also in other cell types, beyond breast cancer cells? What if they pooled data from different cell types that cover the morphological state space more broadly?

      R2.8 We thank the reviewer for this interesting point. We have observed that many if not most of the general processes identified, i.e. developmental pathways, extracellular matrix regulatory pathways and adhesion pathways, are already known to be associated with cell shape regulation and mechanotransduction in many different cell types. Thus, at the ‘big picture’ level, our findings hold across multiple cell types. However, the precise wiring of our network seems to be breast cancer specific. This is evidenced by the fact that when we try to use the LINCS data from other cell types to see if our network still holds in these contexts, we do not observe a significantly larger change in morphology when perturbing central nodes within our network compared to nodes outside of it. This is not unexpected: depending on the individual molecular background of each tissue and tumour type, signalling networks are known to be wired differently. Indeed, our method, which uses the context- and cell feature-specific gene expression modules and the transcription factors that regulate them as a basis for extracting our cell shape signalling network, allows for identification of the network's specific wiring in the context used for training, i.e. the breast cancer cells. It would be very interesting to repeat our analysis on similar data from a different tissue type to identify parts of the network that seem to be identical versus those that differ. We have not yet been able to identify public systematic high content imaging data for a different tissue across multiple cell lines, but we will continue to look for such a dataset in the literature and through our network of collaborators.
      We will also explore the possibility of extracting such information from images from the TCGA to perform this analysis across patients of a specific cancer type, although admittedly we are not sure how feasible it would be to extract features analogous to the ones used for the breast cancer network from the images available. We will also add this point to the discussion.

      Reviewer #2 (Significance):

      The premise of this manuscript is very exciting and interesting: Is it possible to identify from a correlation of cell morphology and single cell gene expression the underlying cell signaling states that control morphology? Answers to this will begin to shed some light on the black box relation of morphology as an informant of cell states, which has been exploited by pathologists, physiologists, and cell biologists for more than a century, and which has seen a sharp revival recently thanks to deep learning, which is exceedingly good at finding correlations between data patterns.

      This paper would have a broad audience in quantitative cell biology, systems biology, and perhaps also life data science and cancer (although the cancer aspect is marginal).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this work Barker et al. used computational approaches to analyze several existing data sets (including morphology and expression) in the common context of a signaling-regulatory network that correlates with cell morphological features. They identified several pathways and associated transcription factors whose expression levels correlate with specific cell morphological features. The work thus has two main contributions. First, it provides a network of signaling pathways and regulons that may affect the morphological features of breast cancer cells. Second, the computational procedure may be general enough to study other systems.

      Main comments

      3.1 I assume that in all analyses using the packages listed in the manuscript, some parameters need to be selected. The authors need to provide these details, and discuss whether the results are robust against parameter choice (at least to certain degree).

      R3.1 We thank the reviewer for this comment. Parameters need to be selected in WGCNA and PCSF. In WGCNA they are selected based on the guidance given by the authors of the package; little significant variation in the results was observed when these were changed (i.e. the make-up of the gene expression modules naturally changed, but the processes enriched in the modules correlated to cell shape were the same).

      PCSF is a more sensitive step because the best solutions to the sub-network identification problem are observed when the network edge weights are permuted over a number of iterations. Following this, the union of the produced array of networks is taken to be the solution. Biology is an inherently noisy system, and so this formulation of the PCSF algorithm can capture latent network architecture that the deterministic variant cannot. This introduces extra parameters based on the requirement to introduce random noise to the network, along with the standard PCSF parameters (seen here: https://rdrr.io/github/IOR-Bioinformatics/PCSF/man/PCSF_rand.html) that are used to account for user variations in network degree distribution, edge-weight distribution, etc. It is normal for some tuning of parameters to be required for users to tailor their PCSF to their supplied network. We used the degree distribution to gauge whether our network appeared to follow a biological ‘scale-free’ distribution and selected parameters based on that. This provides an affirmation that our resulting network is consistent with how we understand the topology of biological networks, and as a result the parameters selected are not arbitrary.

      Nonetheless, we also tested variations in these parameters and found that although levels of significance in our validation would vary, the trends apparent from our validation did not (i.e., that targeting kinases within our network produced a larger effect on cell shape than those outside). From this we were assured that our conclusions mentioned in the discussion were robust to parameter selection. All details of parameters used are currently in our gitlab page, but we will additionally include them in the methods section.
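
      The perturb-and-union idea described above can be sketched in a few lines of Python. This is an illustration only, not the PCSF package or the authors' pipeline: `stand_in_solver` is a hypothetical placeholder that simply joins the terminals by cheapest paths (a real prize-collecting Steiner forest solver would take its place), and the function names, the toy graph, and the `noise` and `n_runs` parameters are assumptions made for the example.

```python
import heapq
import random


def shortest_path_edges(graph, source, target):
    """Dijkstra: return the undirected edges on one cheapest source-target path."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    edges = set()
    node = target
    while node in prev:
        edges.add(frozenset((node, prev[node])))
        node = prev[node]
    return edges


def stand_in_solver(graph, terminals):
    """Hypothetical stand-in for a PCSF solver (NOT PCSF itself):
    connect every terminal to the first one by a cheapest path."""
    root = terminals[0]
    edges = set()
    for terminal in terminals[1:]:
        edges |= shortest_path_edges(graph, root, terminal)
    return edges


def perturb(graph, noise, rng):
    """Copy the graph, scaling each undirected edge cost by a random factor
    drawn uniformly from [1 - noise, 1 + noise]."""
    noisy = {u: {} for u in graph}
    for u in graph:
        for v, cost in graph[u].items():
            if v not in noisy[u]:
                scaled = cost * (1.0 + rng.uniform(-noise, noise))
                noisy[u][v] = scaled
                noisy.setdefault(v, {})[u] = scaled
    return noisy


def union_of_noisy_runs(graph, terminals, n_runs=10, noise=0.05, seed=0):
    """Solve on n_runs perturbed copies and return the union of all solutions."""
    rng = random.Random(seed)
    union = set()
    for _ in range(n_runs):
        union |= stand_in_solver(perturb(graph, noise, rng), terminals)
    return union


if __name__ == "__main__":
    # Toy graph with two nearly equivalent routes from A to D.
    toy = {
        "A": {"B": 1.0, "C": 1.0},
        "B": {"A": 1.0, "D": 1.0},
        "C": {"A": 1.0, "D": 1.02},
        "D": {"B": 1.0, "C": 1.02},
    }
    print(sorted(tuple(sorted(e)) for e in union_of_noisy_runs(toy, ["A", "D"])))
```

      With `noise=0.0` every run reproduces the deterministic solution; with noise, near-optimal alternative routes can enter the union, which is the behaviour the randomised formulation is meant to exploit.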

      3.2 Cells show some degree of heterogeneity both in cell expression and morphological features, which can be affected by many factors. Wu et al. (Sci Adv 2020, 6 (4): eaaw6938) identified several subgroups of MDA-MB-231 cells with persistent (over generations) distinct morphology, expression profiles, and metastasis potential. Another possible main contributor to cell morphology heterogeneity is cell cycle stage. I understand that the analyses in this work are limited by the types of data available, for example, the expression data are largely bulk. One exception might be the data shown in Fig 5. Besides giving the fold changes after kinase inhibitor treatment, the authors may also analyze the variance of cells before and after treatment to estimate the relative extent of cell-cell heterogeneity relative to the effect due to treatment.

      R3.2 We thank the reviewer for the comment and suggestion. We will perform such an analysis and include this point in the discussion. However, to reassure the reviewer, we are not concerned about this affecting our analysis in a major way as, in data from high content microscopy experiments such as the ones we used, hundreds of cells are sampled and the resulting quantified phenotypes are represented by the average from single cells, after removing outliers. Similarly in the bulk RNAseq experiments the dominant cell phenotype/expression profile would be mainly represented in the data. We are therefore reasonably confident that both sets of data used from each cell line indeed represented the most common phenotype for that cell line.
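
      The comparison suggested in 3.2 can be made concrete with a small sketch: for one morphological feature, compare the shift in the mean between untreated and treated cells with the cell-to-cell spread within each condition. The Python below is an illustration under stated assumptions, not an analysis from the manuscript; the function name, the choice of a pooled-standard-deviation effect size, and the toy numbers are all hypothetical.

```python
import math
import statistics


def heterogeneity_vs_treatment(control, treated):
    """Relate the treatment-induced mean shift of one feature to the
    within-condition (cell-to-cell) spread.

    control, treated: per-cell measurements of the same feature.
    """
    n_c, n_t = len(control), len(treated)
    mean_c = statistics.mean(control)
    mean_t = statistics.mean(treated)
    var_c = statistics.variance(control)  # sample variance (n - 1 denominator)
    var_t = statistics.variance(treated)

    # Pooled within-condition variance, weighted by degrees of freedom.
    pooled_var = ((n_c - 1) * var_c + (n_t - 1) * var_t) / (n_c + n_t - 2)
    if var_c == 0 or pooled_var == 0:
        raise ValueError("zero variance: ratios are undefined")

    pooled_sd = math.sqrt(pooled_var)
    return {
        "mean_shift": mean_t - mean_c,
        "pooled_sd": pooled_sd,
        # Shift expressed in units of cell-to-cell spread.
        "standardised_shift": (mean_t - mean_c) / pooled_sd,
        # > 1 means cells are more heterogeneous after treatment.
        "variance_ratio": var_t / var_c,
    }


if __name__ == "__main__":
    # Hypothetical per-cell values of a single feature (arbitrary units).
    untreated_cells = [10, 12, 11, 13]
    treated_cells = [15, 17, 16, 18]
    print(heterogeneity_vs_treatment(untreated_cells, treated_cells))
```

      A standardised shift well above 1 would indicate that the treatment effect exceeds typical cell-to-cell variation, while a variance ratio far from 1 would indicate that the treatment itself changes heterogeneity.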

      3.3 As related to point 2, in Fig. 1B, I am surprised that cell cycle only correlates with cell area significantly, while one knows that cells undergo dramatic change during cell cycle. For example, cells would turn to be roundish for mitosis. How would the authors explain the results? Is it possible that there is sampling bias towards interphase cells?

      R3.3 We thank the reviewer for this comment and apologise for the confusion. The y axis of Fig.1B relates to gene expression modules identified in the expression data. These were named based on any informative term that could be associated with the genes within the modules as implemented by gene set enrichment. The goal was to provide more informative names than the default module names that are based on colours. ‘Cell cycle’ as a term in Reactome is a particularly generalisable gene set and was applied to the gene expression module in question because it was the only informative term identified for it. This singular gene expression module does not represent all transcriptomic activity associated with the cell cycle process. Indeed, the term ‘Cell cycle’ was also enriched in the ‘Hedgehog off-state’ gene expression module (Supplementary table 5). As the enrichment is based only on the genes: HAUS8;MCM8;NCAPH2;MIS12;BIRC5;CENPM;SPDL1;FBXO5;TYMS;TUBB4A, which are not necessarily the major cell cycle-relevant genes, we agree that the name of the specific module is not ideal and can cause confusion so we will rename this module. We will also go over the naming of all the modules to ensure that the names are indeed representative of the module functions.

      **Minor comments**

      3.4 In Fig 1B, I have trouble understanding the biological relevance of some module names, like "Green", "indianred4"?

      R3.4 Our pipeline uses WGCNA which constructs gene expression modules completely from gene expression data. We named modules based on terms we could find associated with the genes within a module. Some modules did not have any informative terms associated with them and so we opted to keep the default name of those modules that WGCNA supplies (based on colours). We will attempt to make this clearer in our revised manuscript, by adding a better explanation, and renaming these modules to something that makes it clear that we could not assign a clear function such as non-annotated (NA) module 1,2,3 etc.

      3.5 Fig 3B: I can't find a detailed explanation on how the combined score was calculated.

      R3.5 Thank you for pointing this out. This is described in Chen et al. 2013 as part of the enrichR package for gene set enrichment analysis. We will add this detail in the methods section under “Quantification and Statistical Analysis”.
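
      For orientation, Chen et al. 2013 describe the combined score as the product of the log of the Fisher exact test p-value and a z-score for the deviation of a term's rank from its expected rank. A minimal Python sketch under that reading follows. It is not the enrichR implementation: the function and argument names are hypothetical, the z-score is taken as an input because it is derived from background rank statistics that are not reproduced here, and sign conventions may differ between implementations.

```python
import math


def fisher_right_tail(overlap, list_size, set_size, background_size):
    """One-sided Fisher exact (hypergeometric) p-value: probability of seeing
    at least `overlap` gene-set members in a list of `list_size` genes drawn
    from `background_size` genes, of which `set_size` belong to the gene set."""
    upper = min(list_size, set_size)
    numerator = sum(
        math.comb(set_size, i) * math.comb(background_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return numerator / math.comb(background_size, list_size)


def combined_score(p_value, z_score):
    """Combined score as log(p) multiplied by z (assumed reading of Chen et al. 2013)."""
    return math.log(p_value) * z_score


if __name__ == "__main__":
    # Hypothetical numbers: 3 of 3 input genes fall in a 3-gene set, 10-gene background.
    p = fisher_right_tail(overlap=3, list_size=3, set_size=3, background_size=10)
    print(p, combined_score(p, z_score=-2.0))
```

      Because log(p) is negative for p < 1, a negative z-score (a term ranking better than expected) yields a positive combined score under this convention.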

      3.6 Some of the cell features in Fig 5A are not in Fig 1. Are they from the same analysis? Any explanation?

      R3.6 As the two datasets were acquired in two completely different studies, there isn’t a one-to-one correspondence between the phenotype features; however, several of them essentially represent the same phenotype. For example, in Fig. 5, cytoplasm area, cytoplasm perimeter, nucleus area, nucleus length, nucleus width and nucleus perimeter are analogous to features in Fig. 1, such as cell area, cell width to length, nucleus area and nucleus width to length. The overlap between the features in these two datasets is not exact, however, and we use the features in Fig. 5 that were not used in the network construction as a negative control. This allows us to show that our network is phenotype-specific to the morphology features it was trained on. We will clarify this in the manuscript.

      3.7 It is interesting that the authors have identified a number of pathways known to be related to mechanosensing. Does the Hippo-YAP/TAZ pathway appear in their analysis?

      R3.7 Yes, YAP1 is also significantly highly ranked in our network propagation of activated transcription factors in Fig. 4 in both luminal- and basal-shaped cell lines. Furthermore, since submitting we have been experimenting with identifying subnetworks of our regulatory network using maximum flow. Here we assess the interaction between the Rap1 module (given its centrality to our discussion) and NFkB, asking what is the most efficient ‘flow’ of information between these two nodes given our network. Interestingly, we identify LATS2, DVL1 (genes within the Rap1 module), YAP1 and TAZ as key mediating factors between these nodes. This implicates the Hippo-YAP/TAZ pathway as being of particular importance at the interface of our identified gene expression module and our derived signalling network. As an illustration of this we include here one of the preliminary networks derived from this analysis.

      Figure - Network describing maximum flow (top 20 edges) between the Rap1 signaling module as the source node and NFKB1 as the target node. The super-node representing the gene expression module is coloured red (with the interface nodes used to embed it in the signaling network coloured blue) and NFKB1 is coloured azure. Edge thickness indicates weight, which was used as the maximum capacity in the max flow calculation.
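
      The kind of calculation behind such a figure can be illustrated with a short, self-contained Python sketch of maximum flow (Edmonds-Karp), using edge weights as capacities and then ranking edges by the flow they carry. This is not the analysis code: the node names and capacities are toy values, the functions `max_flow` and `top_edges` are hypothetical, and the graph is assumed to be directed with no antiparallel edge pairs.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, target):
    """Edmonds-Karp maximum flow.

    capacity: dict of dicts, capacity[u][v] = capacity of directed edge u -> v.
    Returns (total_flow, flows) where flows[(u, v)] is the flow on each used edge.
    Assumes no antiparallel pairs (u -> v and v -> u both present).
    """
    residual = defaultdict(lambda: defaultdict(float))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0.0  # reverse edge for undoing flow

    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            break  # no augmenting path left: flow is maximal

        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = target
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = target
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

    flows = {}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            used = c - residual[u][v]
            if used > 1e-12:
                flows[(u, v)] = used
    return total, flows


def top_edges(flows, k):
    """Return the k edges carrying the most flow, largest first."""
    return sorted(flows.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    # Toy network; weights play the role of maximum capacities.
    toy = {
        "source": {"a": 3.0, "b": 2.0},
        "a": {"target": 2.0, "b": 1.0},
        "b": {"target": 3.0},
    }
    total, flows = max_flow(toy, "source", "target")
    print(total, top_edges(flows, 3))
```

      Keeping only the highest-flow edges, as in the "top 20 edges" of the caption, gives a compact subnetwork of the routes that carry most of the flow between the two chosen nodes.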

      We will go over the figures and move barplots and more technical information to the supplement to make space for more figures of this nature that better illustrate the processes involved in the regulation of cell shape as derived from our analysis.

      Reviewer #3 (Significance):

      Extensive studies link cell morphological features and cell expression states, with some important work from one of the authors (Chris Bakal). This topic gains further interest recently. For example, Wu et al. (Sci Adv 2020, 6 (4): eaaw6938) demonstrated that cell shape encodes metastasis potential. Wang et al. (Sci. Adv. 2020 6 (36): eaba9319) traced cell dynamics in cell feature space. Given the context, the work of Barker et al. is a timely study to establish a cell shape-signaling network from integrated analysis of several different types of data. While some previous studies have related the NFkB network and cell morphology, this work further provides unbiased analysis on the relation between cell morphological features and multiple pathways/transcription factors. It is interesting, for example, that the study identifies the correlation between the Rap1 pathway and cell morphology, which was not well studied previously. As the authors acknowledged, there are some limitations of their approach. For example, the relations identified are correlative instead of causal without further verification. The resultant network may have sampling bias. Despite these limitations, I suggest that this work will be a nice contribution to the field and can provide a basis for further studies.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      In this work Barker et al. used computational approaches to analyze several existing data sets (including morphology and expression) in the common context of a signaling-regulatory network that correlates with cell morphological features. They identified several pathways and associated transcription factors whose expression levels correlate with specific cell morphological features. The work thus has two main contributions. First, it provides a network of signaling pathways and regulons that may affect the morphological features of breast cancer cells. Second, the computational procedure can be generalized to study other systems.

      Main comments

      1) I assume that in all analyses using the packages listed in the manuscript, some parameters need to be selected. The authors need to provide these details, and discuss whether the results are robust against parameter choice (at least to certain degree).

      2) Cells show some degree of heterogeneity both in cell expression and morphological features, which can be affected by many factors. Wu et al. (Sci Adv 2020, 6 (4): eaaw6938) identified several subgroups of MDA-MB-231 cells with persistent (over generations) distinct morphology, expression profiles, and metastasis potential. Another possible main contributor to cell morphology heterogeneity is cell cycle stage. I understand that the analyses in this work are limited by the types of data available, for example, the expression data are largely bulk. One exception might be the data shown in Fig 5. Besides giving the fold changes after kinase inhibitor treatment, the authors may also analyze the variance of cells before and after treatment to estimate the relative extent of cell-cell heterogeneity relative to the effect due to treatment.

      3) As related to point 2, in Fig. 1B, I am surprised that cell cycle only correlates with cell area significantly, while one knows that cells undergo dramatic change during cell cycle. For example, cells would turn to be roundish for mitosis. How would the authors explain the results? Is it possible that there is sampling bias towards interphase cells?

      Minor comments

      4) In Fig 1B, I have trouble understanding the biological relevance of some module names, like "Green", "indianred4"?

      5) Fig 3B: I can't find a detailed explanation on how the combined score was calculated.

      6) Some of the cell features in Fig 5A are not in Fig 1. Are they from the same analysis? Any explanation?

      7) It is interesting that the authors have identified a number of pathways known to be related to mechanosensing. Does the Hippo-YAP/TAZ pathway appear in their analysis?

      Significance

      Extensive studies link cell morphological features and cell expression states, with some important work from one of the authors (Chris Bakal). This topic gains further interest recently. For example, Wu et al. (Sci Adv 2020, 6 (4): eaaw6938) demonstrated that cell shape encodes metastasis potential. Wang et al. (Sci. Adv. 2020 6 (36): eaba9319) traced cell dynamics in cell feature space. Given the context, the work of Barker et al. is a timely study to establish a cell shape-signaling network from integrated analysis of several different types of data. While some previous studies have related the NFkB network and cell morphology, this work further provides unbiased analysis on the relation between cell morphological features and multiple pathways/transcription factors. It is interesting, for example, that the study identifies the correlation between the Rap1 pathway and cell morphology, which was not well studied previously. As the authors acknowledged, there are some limitations of their approach. For example, the relations identified are correlative instead of causal without further verification. The resultant network may have sampling bias. Despite these limitations, I suggest that this work will be a nice contribution to the field and can provide a basis for further studies.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors combine single cell morphology and gene expression data to identify signaling activities implicated in the control of cellular morphogenesis. They describe a reasonable bioinformatics pipeline from gene expression shifts between two morphological phenotypes to pathways, then to common transcription factors to signaling. As far as I can assess the situation (I am not familiar with all the tools they use) the proposed pipeline works convincingly. However, I am concerned that the logic underlying this analysis is only partially valid. The link between signaling and morphology may be more direct than via TF-based gene expression regulation. Many signals (and many of the kinases the authors test for validation) are implicated in morphology control as direct upstream regulators of cytoskeleton dynamics and adhesion. This also applies to the GTPase Rap1, which the authors fish out as the most differentially expressed signal between two types of morphologies. In addition to the indirect effect of Rap1 on morphology via NFKB regulation suggested by the authors, Rap1 will affect morphology probably very directly through activation of Rac -> F-actin and RIAM -> nascent adhesions. At minimum, the authors should discuss this complexity as a caveat of their approach. And dependent on the impact the authors hope to have with this story, I believe they should experimentally resolve the ambiguity of direct vs indirect signaling for some of their key interpretations.

      In defense of the presented premise, the authors start out by looking for correlation between gene expression and morphology, and they find some signal. Correlation analysis, especially in large data sets, tends to be pretty robust and specific, even in the presence of strong confounders. Thus, even though the expression-morphology correlation, which points indirectly at morphology-regulating signaling modules, is likely to be superimposed by direct morphology-regulating signaling pathways that the proposed approach will not be able to detect, the presented analysis is valuable in principle.

      That said, I have a number of substantial concerns also with the implementation and presentation of the approach. First, on the presentation side, for a paper that talks about cell morphology it is strange to have not a single figure panel showing an image of cells, or at least cell outlines. As a reader I would like to get visual impression of how different a high vs low Rap1 gene expresser is, for example. Along the same lines, it is not quite clear to me when the authors collate entire cell lines into a single phenotype, do they switch then to population-based analysis? That is, for example the volcano plots in 2B,C are they representing an average gene expression shift? How heterogeneous are the morphological signals? Are the correlations between gene expression and morphology computed with single cell data as the basis? Could the volcano plots be sharpened by accounting for the single cell variation in morphology instead of lumping the cells into two morphological classes? On the back end of the paper, when the authors apply kinase inhibitors to validate some of the claimed pathways, it would be nice for the reader to see the morphological effects of these inhibitors. And to relate the kinase induced shifts to the morphological heterogeneity that is the basis for the study driving, initial correlation analysis? At the end of the day, the proof is in the pudding.

      Finally, cell morphology regulation is a pretty foundational process of life. One therefore wonders whether the pathways the authors pulled out of their analysis work also in other cell types, beyond breast cancer cells? What if they pooled data from different cell types that cover the morphological state space more broadly?

      Significance

      The premise of this manuscript is very exciting and interesting: Is it possible to identify from a correlation of cell morphology and single cell gene expression the underlying cell signaling states that control morphology? Answers to this will begin to shed some light on the black box relation of morphology as an informant of cell states, which has been exploited by pathologists, physiologists, and cell biologists for more than a century, and which has seen a sharp revival recently thanks to deep learning, which is exceedingly good at finding correlations between data patterns.

      This paper would have a broad audience in quantitative cell biology, systems biology, and perhaps also life data science and cancer (although the cancer aspect is marginal).

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The submitted manuscript 'Identification of phenotype-specific networks from paired gene expression-cell shape imaging data' of Barker et al. uses a convolute of different bioinformatics tools (see Key Resource Table) to analyze reported RNA sequencing data and to correlate derived pathways with imaging features of breast cancer cell lines based on specific pathway constructions. The thin red line of the data presentation in the manuscript is not obvious.

      Major concerns:

      1. The main biological 'finding' of the study, RAP1 'as a potential mediator between the sensing of mechanical stimuli and regulation of NFkB activity', has already been reported, and therefore the assumption 'how exactly extra-cellular mechanical cues are sensed by the cell and passed on to NFkB in breast cancer is not understood' is misleading. Please review: https://www.nature.com/articles/ncb2080 (human breast cancers with NF-κB hyperactivity show elevated levels of cytoplasmic Rap1. Similar to inhibiting NF-κB, knockdown of Rap1 sensitizes breast cancer cells to apoptosis), https://pubmed.ncbi.nlm.nih.gov/17510404/ (RAP1 is a crucial element in organizing acinar structure and inducing lumen formation), and https://pubmed.ncbi.nlm.nih.gov/21429211/. Besides, Fig. 2 and 3 are unrelated to these main statements.
      2. The spotted RAP1 (by TFs JARD2 and RUNX2) finding is not obvious without the Fig. 4 results, a network propagation of functional TFs in differentially activated processes (basal vs. luminal) in the cell shape regulatory network. Please show that RAP1 could not be identified without the network, based on TF and DEG only.
      3. More complex fluorescence phenotypes are available, and the 10 simple cell shape features used here do not match the complexity of the RNASeq data, data input and pathway construction. Conversely, relatively 'monoclonal' breast cancer cell lines may be the only application for this workflow. Image features in Fig. 1 and 5 do not match, with Fig. 5 being a rather indirect 'proof' of usability.
      4. Fig. 1a is not yet visually self-explanatory and asks a lot of the reader.

      Significance

      The 'Review Commons' efficiently facilitates the reviewing process for the corresponding journals due to the broader 'audience'. On the other hand, authors face less restrictions and pressure for the same reason. Although I really like the idea of pulling the reviews upstream into the preprint process, I would like to answer here also with a kind of pre-review to avoid entering partly immature manuscript to 'Review Commons'. Review Commons might install an automated 'sanity check' of manuscripts in the future to keep the quality of submissions higher?

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Review Commons Reviews for Refereed Preprint RC-2021-00693

      Ferrari G. et al., DLL4 and PDGF-BB regulate migration of human iPSC-derived skeletal myogenic progenitors.

      Reviewer #1 (Evidence, reproducibility and clarity): Summary:

      The paper presented by Ferrari et al. aims to improve the migration capacity of hiPSC-derived myogenic progenitors. For this purpose, the authors used a previously published, well-characterized hiMPs model and focussed on the modulation of NOTCH and PDGF signaling pathways. The rationale for targeting these pathways was based on the molecular events of muscle cell migration during development described in the literature.

      Major comments: Are the key conclusions convincing?

      This is a very interesting paper. A few clarifications, as suggested below, are needed before it is fully convincing. The enrichment tests, heat maps and network analysis are not well explained in terms of which genes were selected and why, and in terms of which gene sets were selected and why. In some cases, the information may be given in the paper, but it is not easy for the reader to find it. It should be stated more clearly. For example, in Fig 2C, why were these eight chosen for the heat maps and not other genes known to be involved in myogenesis, cell migration etc.? Similar comment for figure 3 A, D and G. Another example: in Fig 2E, on what basis are some gene sets chosen to be shown in this figure when there are many more significant ones in the supplementary table?

      We thank the Reviewer for their positive feedback and for this comment. Although some answers to the queries could be found within the figure legends, we agree that figures could have been more self-explanatory, and we will amend them accordingly. We will also add additional information into the main text to clarify those specific points.

      In response to the specific queries:

      • All enrichment heat maps were generated from GO lists or KEGG pathways.
      • 2C: these were chosen instead of other myogenic or cell migration markers for consistency with our previous study (Figure 2C in Gerli et al Stem Cell Reports 2019).
      • 3A, D, G: details of the GO lists used to generate heat maps were available in the relevant figure legends.
      • 2E: enrichment pathways – we listed pathways shared between at least 2 of the three groups and with relevance to cellular migration.

      Figure 4F is impossible to interpret without a clear description of how the subnetwork is extracted. Was a gene list submitted to STRING, and if so, which genes and why? Secondly, why are there many nodes with no edges? Are these all of the nodes in that GO term? If so, it needs to be clarified. Was this the most strongly deregulated GO term according to the STRING analysis?

      We thank the Reviewer for this comment. This specific GO list was selected for its highly relevant title/topic, i.e.: “positive regulation of cell migration”. Details on this point could also be found in the specific figure legend, where we specified how the network is extracted and constructed. There are several nodes with no edges as the edges represent predicted functional association and therefore, a lack of edges suggests a lack of interaction.

      Figure 4 B, C, D and E: (1) The authors should clarify what figure 4B is? Is 1,2,3,4 different time point? Treated or untreated cells?

      We apologise to the Reviewer for not having provided enough information on this point. 1, 2, 3 and 4 are four sequential time points of untreated cells. We will amend the figure to make this clearer.

      (2) Figure C: Is the graph showing the cell distribution of both treated and untreated cells? If yes is it possible to give a different shape for the control cells and see if indeed more control green shape would be observed in this plot? (In the supplementary data there is the distribution showing the treated v untreated, but the clusters are not visible)

      We thank the Reviewer for this helpful comment. We agree that this will increase the quality of the figure. We will distinguish treated and control cells within figure 4C by replacing dots with different shapes for treated and untreated samples.

      (3) Would it be possible to take some of the parameters in Figure 4D and show the distribution in treated vs untreated and perform the statistical analysis? (eg is there a significant difference for the parameter total distance between control and treated?). Or, may be just show some of the results in figure S4C and E in the main text.

      We thank the Reviewer for this comment. We agree that it will be better to move S4C into the main figure and we will action this point in the revised version of the manuscript.

      (4) Why pool the 3 independent experiments together? Looking at the data in Figure S4, it seems that one treated sample is very similar to the control, thus weakening the conclusion. The replicates in this figure are biological replicates. Yet the paper presents 4/5 different cell lines, so why are only 3 of them used here? Is there some explanation regarding the outlier (cell line age, number of divisions etc.)? It might be worth adding data from the other cell lines (1 or 2 more).

      We thank the Reviewer for this point. The experiment shown in figure S4E has been performed with one cell line (N5) and independent experimental replicates were assessed for the statistical analysis. We are not sure why there appears to be an outlier in some cases, and this is why it was important to replicate this experiment three times. However, we will also repeat this experiment with another cell line applying more stringent conditions to strengthen this point.

      (5) Figure 4 H and I: What are the statistics actually comparing: treated v untreated for each cell line, or different cell lines against each other? If the former, then how is it possible to have a 139 fold change with such a weak p value of 0.042? If the latter, then why is a p-value given for each of the 3 cell lines? Also, the number and source of replicates is unclear - N=3 is stated, so was each cell line done in triplicate? If so, how many fields per replicate?

      We are happy to clarify this point for the Reviewer. The statistical analysis compares treated vs. untreated samples within the same genotype. The high fold change observed is likely due to the large standard deviation of the dataset, which was also highlighted as raw data in the figure panel (bottom part of each picture in white colour font). For this reason, we have repeated this experiment multiple times and validated it across three independent cell lines.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. It would be important to also show the migratory capacity of these cells in vivo.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Human muscle cell engraftment and tracking in immunodeficient mice could be easily done. Engrafted muscle can be harvested 2-3 weeks after engraftment, and measurement of the distance from the engraftment point could be done (the site of injection could be labelled with tattoo dye). This would be a month to a month and a half of work. Immunodeficient mice would cost around £1500 (n=6 mice per group => total of 12 mice) plus the cost of housing.

      Are the data and the methods presented in such a way that they can be reproduced? Are the experiments adequately replicated and statistical analysis adequate? See comments in first paragraph. The authors should probably be able to answer easily to the different concerns raised above.

      We thank the Reviewer for these comments. We agree that the suggested in vivo experiment might strengthen our work and we are currently sourcing all required materials to perform it. Additionally, we will perform a similar, quasi-vivo, experiment to study migration in a species-specific setting by delivering cells in 3D models in vitro (e.g. Maffioletti SM et al., Cell Reports 2018). This strategy will provide a solid alternative to the in vivo assay, in the eventuality that the xenogeneic setting will limit the resolution of the proposed transplantation experiment.

      Minor comments: typo "Onthology" should be "Ontology" in figure 2E. Some of the data in Figure S4E should be moved to the main text.

      Thank you for highlighting these minor comments. We will correct the typo and move data from figure S4 into the main figure 4.

      Reviewer #2 (Evidence, reproducibility and clarity): In this manuscript, Ferrari and colleagues provide solid data indicating that the Notch ligand DLL4 and PDGF-BB regulate the migration of myogenic progenitors derived from human pluripotent stem cells (PSC). These studies built from recent work by the same group (Gerli et al, Stem Cell Reports, 12:461, 2019), in which the authors documented that Notch and PDGF-BB signaling enhances migration and expression of stem cell markers while inducing perivascular cell features in muscle satellite cells. Here the authors perform similar in vitro studies in PSC-derived myogenic progenitors and conclude that the same effect is observed in this population of cells. The results are clear and well presented.

      Throughout the manuscript, the authors emphasize the importance of such findings for the future therapeutic application of a PSC-based therapy to treat patients with muscular dystrophy, since multiple skeletal muscles need to be targeted in this group of diseases. Unfortunately, the authors do not provide transplantation data. The results would be highly meaningful if they showed that the observed in vitro changes (transcriptomes and chamber assay) result in meaningful migration in vivo using systemic delivery, but as it is, the data do not support the claims and conclusions.

      We thank Reviewer 2 for their comments. We were pleased to read that they found our study and data solid, clear and well presented. Although we agree with the Reviewer that in vivo evidence would strengthen our findings, we would like to highlight that our study did not aim to be a translational investigation of the therapeutic potential of treated hPSC derivatives for muscle cell therapy (we believe our manuscript’s title reflects this). We see this work more as a foundational study to establish the required evidence for future follow-up transplantation studies focused on the therapeutic potential of this approach (something requiring a dedicated project, funding and months/years of work).

      Moreover, we believe that xenogeneic transplants are of limited use to investigate a complex species-specific phenomenon such as transendothelial cell migration. For this very reason we moved back to intraspecific transplants in past studies (e.g., Tedesco et al., Sci Transl Med 2012). However, as a key aim of our study is to obtain data specific to human cells and given that we already performed mouse-in-mouse in vivo intra-arterial delivery experiments using DLL4 and PDGFBB treated primary cells in Gerli et al. Stem Cell Reports 2018, we are proposing and planning to:

      • Test transendothelial migration with another quasi-vivo microfluidic assay orthogonal to the reported transwell experiments. This will model intraspecific (i.e., human-in-human) transendothelial migration under flow conditions.
      • Assess evidence of migration in human 3D muscles setting up a novel invasion assay in our in vitro 3D muscle models.
      • Perform intramuscular delivery of treated vs. untreated cells as per Reviewer 1's request to assess migration in skeletal muscle in vivo. This approach will optimise in vivo experiments in a 3Rs-compliant fashion, avoiding invasive procedures in animals to study intravascular delivery.

      Reviewer #2 (Significance): Significance is limited if only in vitro data are provided. However if the authors are able to show enhanced engraftment upon systemic transplantation of human PSC-derived myogenic progenitors upon DLL4 and PDGF-BB treatment, the significance would be high.

      Please see our reply to the previous point.

      In terms of existing literature, there are publications reporting systemic delivery of murine PSC-derived myogenic progenitors as well as transcriptome and in vitro migration studies. It would probably be appropriate to cite them.

      We apologise to the Reviewer for this oversight. We will add the following papers, which include systemic delivery of murine PSC-derived myogenic progenitors as well as transcriptome and migration studies: Matthias N et al., Exp Cell Res 2015; Incitti T et al., PNAS 2019.

      If systemic engraftment is observed, the manuscript would be of interest to the skeletal muscle and stem cell biology/regenerative medicine community.

      Please see our reply to the initial point.

      Reviewer #3 (Evidence, reproducibility and clarity):

      In this manuscript, the authors exploited the signal-mediated activation of NOTCH and PDGF pathways, by one week-long delivery of DLL4 and PDGF-BB to cultures of hiPSC-derived myogenic progenitors in vitro, to improve their migration ability. They performed transcriptomic and functional analyses across human and mouse primary muscle stem cells and human hiPSC-derived myoblasts, including genetically corrected hiPSC derivatives, to show that DLL4 and PDGF-BB treatment modulates pathways involved in cell migration, including enhanced trans-endothelial migration in transwell assays.

      The increased migratory ability, and in particular enhanced extravasation, is a fundamental property required for optimal performance of hiPSC myogenic derivatives upon their intravascular delivery; hence, the findings reported here are of extremely high potential interest in terms of solving one of the major bottlenecks of cell therapy. However, there are important issues that need to be resolved by the authors with additional experimentation, which I recommend performing, in order to improve this manuscript.

      We sincerely thank the Reviewer for acknowledging the extremely high relevance and potential of our paper for muscle gene and cell therapies and for providing constructive feedback to improve our manuscript.

      1) The most critical issue here is that the authors fail to provide evidence that DLL4/PDGF-BB-treated cultures of hiPSC-derived myogenic progenitors do not lose their myogenic potential and are able to form myotubes upon interruption of treatment. It would also be important to determine when (how many days after withdrawal of DLL4/PDGF-BB) the full myogenic properties of these cells are recovered. From the RNAseq datasets shown by the authors, it appears that DLL4/PDGF-BB-treated hiPSC-derived myoblasts do not express the key genes of myogenic identity (MyoD) and early differentiation (myogenin), while expressing genes of mesenchymal/vessel-derived lineages. It is imperative that the authors show that these changes are reversible upon withdrawal of DLL4/PDGF-BB. This should be shown by an unbiased transcriptomic analysis (RNAseq) of hiPSC-derived myoblasts after withdrawal of DLL4/PDGF-BB, which should be integrated with functional evidence showing that these cells can resume their ability to form differentiated myotubes upon exposure to myogenic culture cues in vitro.

      We thank the Reviewer for this comment. We agree that this is an important and feasible experiment which will add important information to our work. We performed similar work in our previous study and already observed phenotype reversion of treated cells upon release of the stimuli within a few passages in cultures. However, we agree that this requires systematic assessment and quantification. To this aim, we will assess the reversibility of the DLL4 & PDGF-BB effect by stopping treatment at day 7 and then assessing skeletal myogenic differentiation capacity of target cells at sequential passages and time points post-treatment. Analysis of the differentiation index at different time points will provide functional evidence on the myogenic potential of hiPSC-derived myogenic progenitors post-withdrawal of DLL4 & PDGF-BB. We believe that the Reviewer’s suggestion for transcriptomic analysis via RNA-seq might be overly costly for the purpose of identifying the myogenic potential of treated cells post-withdrawal of treatment, and that qPCR panels alongside immunofluorescence staining may be sufficient.

      2) Parallel evidence in vivo should also be provided, showing that DLL4/PDGF-BB-treated hiPSC-derived myoblasts do not express MyoD and myogenin when delivered intravascularly, but regain their expression after they have crossed the vessel endothelium and have entered the skeletal muscles.

      We thank the Reviewer for suggesting this experiment. We agree that this would be a very interesting point to address; however, it might be very challenging to address this question with the proposed in vivo experiment. Nonetheless, we believe that with a combination of in vitro and in vivo assays we will be able to satisfactorily answer the question: Do DLL4 and PDGF-BB-treated myogenic progenitors regain myogenic potential upon entering skeletal muscle tissue? To this end, we will analyse muscles following intramuscular transplantation of treated and untreated cells. Moreover, to model intravascular delivery and obtain high-resolution imaging, we aim to adapt a microfluidic platform to perform trans-endothelial assays and selectively differentiate cells that successfully cross the blood vessel layer. Although likely to be very challenging, we will attempt to capture or stain those very cells in order to assess the expression of myogenic markers as requested by the Reviewer.

      If these experiments could firmly demonstrate that DLL4/PDGF-BB treatment reversibly promotes migratory properties of hiPSC-derived myoblasts (as predicted, but not demonstrated in previous works from the same group, using mouse or human primary muscle stem cells - Cappellari et al. 2013; Gerli et al. 2019), then this work could be of great interest in terms of basic and translational biology and clearly suitable for publication in a top journal.

      We thank the Reviewer for this constructive feedback and for seeing the great potential of our work in terms of basic and translational biology. We assume there was a typo in the sentence in brackets with a missing “as” (“..not demonstrated as in previous work...”): we indeed demonstrated the effect of DLL4 and PDGFBB in vivo extensively in our previous work.

      Other points:

      • Fig. 2A: it looks like there are some outlier RNAseq sample replicates that might negatively affect the statistics of the subsequent analysis. This issue is likely due to the heterogeneity of the samples (both untreated and treated) and could be resolved by replacing outlier samples with new replicates.

      We thank the Reviewer for this comment. Although we agree that replacing those samples with new replicates might improve our statistical analyses, this will be financially challenging at this stage and might not fully reflect the real variability of the experimental setup.

      • Along the same lines as above, sample heterogeneity following treatment might be resolved by a better understanding of optimal doses of DLL4/PDGF-BB and time of exposure, which I recommend the authors define by additional experiments.

      We thank the Reviewer for this comment. This is a potentially interesting experiment, which we have not performed as we took advantage of previous knowledge and dose-response data on primary mouse and human myoblasts. Overall, we believe that this experiment might not be strictly required at this stage, given that we already have solid evidence of response in hiMPs with a defined concentration and exposure time of DLL4 and PDGFBB.

      Reviewer #3 (Significance):

      If these experiments could firmly demonstrate that DLL4/PDGF-BB treatment reversibly promotes migratory properties of hiPSC-derived myoblasts (as predicted, but not demonstrated in previous works from the same group, using mouse or human primary muscle stem cells - Cappellari et al. 2013; Gerli et al. 2019), then this work could be of great interest in terms of basic and translational biology, clearly suitable for publication in a top journal, and interesting for a wide audience in regenerative medicine.

      We thank the Reviewer once again for this constructive feedback and for seeing the great potential of our work in terms of basic and translational biology, as well as for regenerative medicine.

      Please note that the following statement will be added to the Acknowledgements section of our revised manuscript: "For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission. This work was supported by the Francis Crick Institute which receives its core funding from Cancer Research UK, the UK Medical Research Council, and the Wellcome Trust (FC001002)".

      Once again, we sincerely thank all Reviewers for their positive, constructive and insightful comments, which motivate us to further improve our work. We also thank the Review Commons editorial team for guidance and assistance.

      Prof. Francesco Saverio Tedesco, University College London and The Francis Crick Institute, London, UK.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, the authors exploited the signal-mediated activation of NOTCH and PDGF pathways, by one week-long delivery of DLL4 and PDGF-BB to cultures of hiPSC-derived myogenic progenitors in vitro, to improve their migration ability. They performed transcriptomic and functional analyses across human and mouse primary muscle stem cells and human hiPSC-derived myoblasts, including genetically corrected hiPSC derivatives, to show that DLL4 and PDGF-BB treatment modulates pathways involved in cell migration, including enhanced trans-endothelial migration in transwell assays. The increased migratory ability, and in particular enhanced extravasation, is a fundamental property required for optimal performance of hiPSC myogenic derivatives upon their intravascular delivery; hence, the findings reported here are of extremely high potential interest in terms of solving one of the major bottlenecks of cell therapy. However, there are important issues that need to be resolved by the authors with additional experimentation, which I recommend performing, in order to improve this manuscript.

      1) The most critical issue here is that the authors fail to provide evidence that DLL4/PDGF-BB-treated cultures of hiPSC-derived myogenic progenitors do not lose their myogenic potential and are able to form myotubes upon interruption of treatment. It would also be important to determine when (how many days after withdrawal of DLL4/PDGF-BB) the full myogenic properties of these cells are recovered. From the RNAseq datasets shown by the authors, it appears that DLL4/PDGF-BB-treated hiPSC-derived myoblasts do not express the key genes of myogenic identity (MyoD) and early differentiation (myogenin), while expressing genes of mesenchymal/vessel-derived lineages. It is imperative that the authors show that these changes are reversible upon withdrawal of DLL4/PDGF-BB. This should be shown by an unbiased transcriptomic analysis (RNAseq) of hiPSC-derived myoblasts after withdrawal of DLL4/PDGF-BB, which should be integrated with functional evidence showing that these cells can resume their ability to form differentiated myotubes upon exposure to myogenic culture cues in vitro.

      2) Parallel evidence in vivo should also be provided, showing that DLL4/PDGF-BB-treated hiPSC-derived myoblasts do not express MyoD and myogenin when delivered intravascularly, but regain their expression after they have crossed the vessel endothelium and have entered the skeletal muscles. If these experiments could firmly demonstrate that DLL4/PDGF-BB treatment reversibly promotes migratory properties of hiPSC-derived myoblasts (as predicted, but not demonstrated in previous works from the same group, using mouse or human primary muscle stem cells - Cappellari et al. 2013; Gerli et al. 2019), then this work could be of great interest in terms of basic and translational biology and clearly suitable for publication in a top journal.

      Other points:

      • Fig. 2A: it looks like there are some outlier RNAseq sample replicates that might negatively affect the statistics of the subsequent analysis. This issue is likely due to the heterogeneity of the samples (both untreated and treated) and could be resolved by replacing outlier samples with new replicates.
      • Along the same lines as above, sample heterogeneity following treatment might be resolved by a better understanding of optimal doses of DLL4/PDGF-BB and time of exposure, which I recommend the authors define by additional experiments.

      Significance

      If these experiments could firmly demonstrate that DLL4/PDGF-BB treatment reversibly promotes migratory properties of hiPSC-derived myoblasts (as predicted, but not demonstrated in previous works from the same group, using mouse or human primary muscle stem cells - Cappellari et al. 2013; Gerli et al. 2019), then this work could be of great interest in terms of basic and translational biology, clearly suitable for publication in a top journal, and interesting for a wide audience in regenerative medicine.

      Expertise of this reviewer:

      Muscle regeneration; Muscular Dystrophies; Signaling and Epigenetics

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Ferrari and colleagues provide solid data indicating that the Notch ligand DLL4 and PDGF-BB regulate the migration of myogenic progenitors derived from human pluripotent stem cells (PSC). These studies build on recent work by the same group (Gerli et al, Stem Cell Reports, 12:461, 2019), in which the authors documented that Notch and PDGF-BB signaling enhances migration and expression of stem cell markers while inducing perivascular cell features in muscle satellite cells. Here the authors perform similar in vitro studies in PSC-derived myogenic progenitors and conclude that the same effect is observed in this population of cells. The results are clear and well presented.

      Throughout the manuscript, the authors emphasize the importance of such findings for the future therapeutic application of a PSC-based therapy to treat patients with muscular dystrophy, since multiple skeletal muscles need to be targeted in this group of diseases. Unfortunately, the authors do not provide transplantation data. The results would be highly meaningful if they showed that the observed in vitro changes (transcriptomes and chamber assay) result in meaningful migration in vivo using systemic delivery, but as it is, the data do not support the claims and conclusions.

      Significance

      Significance is limited if only in vitro data are provided. However if the authors are able to show enhanced engraftment upon systemic transplantation of human PSC-derived myogenic progenitors upon DLL4 and PDGF-BB treatment, the significance would be high.

      In terms of existing literature, there are publications reporting systemic delivery of murine PSC-derived myogenic progenitors as well as transcriptome and in vitro migration studies. It would probably be appropriate to cite them.

      If systemic engraftment is observed, the manuscript would be of interest to the skeletal muscle and stem cell biology/regenerative medicine community.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The paper presented by Ferrari et al. aims to improve the migration capacity of hiPSC-derived myogenic progenitors. For this purpose, the authors used a previously published, well-characterized hiMPs model and focussed on the modulation of NOTCH and PDGF signaling pathways. The rationale for targeting these pathways was based on the molecular events of muscle cell migration observed during development, as described in the literature.

      Major comments:

      • Are the key conclusions convincing? This is a very interesting paper. A few clarifications, as suggested below, are needed before the conclusions are fully convincing. The enrichment tests, heat maps and network analysis are not well explained in terms of which genes were selected and why, and which gene sets were selected and why. In some cases, the information may be given in the paper, but it is not easy for the reader to find it. It should be stated more clearly. For example, in Fig. 2C, why were these eight chosen for the heat maps and not other genes known to be involved in myogenesis, cell migration etc.? A similar comment applies to Figure 3A, D and G. As another example, in Fig. 2E, on what basis were some gene sets chosen to be shown in this figure when many more are significant in the supplementary table? Figure 4F is impossible to interpret without a clear description of how the subnetwork was extracted: was a gene list submitted to STRING and, if so, which genes and why? Secondly, why are there many nodes with no edges? Are these all of the nodes in that GO term? If so, this needs to be clarified. Was this the most strongly deregulated GO term according to the STRING analysis? Figure 4B, C, D and E:

      (1) The authors should clarify what Figure 4B shows. Are 1, 2, 3 and 4 different time points? Treated or untreated cells?

      (2) Figure 4C: Is the graph showing the cell distribution of both treated and untreated cells? If yes, is it possible to give a different shape to the control cells and see whether more control green shapes are indeed observed in this plot? (In the supplementary data there is the distribution showing treated vs untreated, but the clusters are not visible.)

      (3) Would it be possible to take some of the parameters in Figure 4D, show the distribution in treated vs untreated and perform the statistical analysis (e.g. is there a significant difference in the parameter total distance between control and treated)? Or maybe just show some of the results in Figure S4C and E in the main text.

      (4) Why pool the 3 independent experiments together? Looking at the data in Figure S4, it seems that one treated sample is very similar to the control, thus weakening the conclusion. The replicates in this figure are biological replicates. Yet the paper presents 4/5 different cell lines, so why are only 3 of them used here? Is there some explanation for the outlier (cell line age, number of divisions etc.)? It might be worth adding data from the other cell lines (1 or 2 more).

      (5) Figure 4H and I: What are the statistics actually comparing: treated vs untreated for each cell line, or different cell lines against each other? If the former, then how is it possible to have a 139-fold change with such a weak p-value of 0.042? If the latter, then why is a p-value given for each of the 3 cell lines? Also, the number and source of replicates are unclear: N=3 is stated, so was each cell line done in triplicate? If so, how many fields per replicate?
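
      As a point of reference for question (5), the sketch below illustrates one way a very large fold change can coexist with a p-value close to 0.05 when N=3: high variability between replicates. It is only an illustration under stated assumptions. The three fold-change values are hypothetical and are not taken from the manuscript, and the test shown (a one-sample t-test on log2 fold changes against zero) is an assumption, since the test actually used for Figure 4H and I is not stated here. The function name is likewise hypothetical.

```python
import math
from statistics import mean, stdev


def one_sample_t_df2(values):
    """Two-sided one-sample t-test against 0 for exactly three values (df = 2)."""
    if len(values) != 3:
        raise ValueError("the closed-form p-value used here only holds for n = 3")
    t_stat = mean(values) / (stdev(values) / math.sqrt(3))
    # Student's t with 2 degrees of freedom has a closed-form CDF:
    # F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)), hence the two-sided p-value below.
    p_value = 1.0 - abs(t_stat) / math.sqrt(t_stat * t_stat + 2.0)
    return t_stat, p_value


# Hypothetical treated/untreated fold changes for three replicates (illustration only).
fold_changes = [20.0, 139.0, 960.0]
log2_fc = [math.log2(fc) for fc in fold_changes]

t_stat, p_value = one_sample_t_df2(log2_fc)
geo_mean_fc = 2 ** mean(log2_fc)  # geometric mean of the fold changes

print(f"geometric mean fold change: {geo_mean_fc:.0f}")
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.3f}")
```

      With only two degrees of freedom, the t statistic must exceed roughly 4.3 to reach p < 0.05, so a spread of this size between replicates is enough to keep the p-value near the threshold even for a mean fold change above 100. Reporting the individual replicate values alongside the test used would let readers judge this directly.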

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. It would be important to also show the migratory capacity of these cells in vivo. - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Human muscle cell engraftment and tracking in immunodeficient mice could be easily done. Engrafted muscle can be harvested 2-3 weeks after engraftment, and the distance from the engraftment point could be measured (the site of injection could be labelled with tattoo dye). This would be a month to a month and a half of work. Immunodeficient mice would cost around £1500 (n=6 mice per group => total of 12 mice) plus the cost of housing.
      • Are the data and the methods presented in such a way that they can be reproduced? Are the experiments adequately replicated and statistical analysis adequate?

      See comments in first paragraph. The authors should probably be able to easily address the different concerns raised above.

      Minor comments:

      typo "Onthology" should be "Ontology" in figure 2E. Some of the data in Figure S4E should be moved to the main text.

      Significance

      Describe the nature and significance of the advance, existing literature, audience: Generating iPSC cell lines with an improved capacity to migrate will be of high interest for the neuromuscular field, and could be a potential therapeutic strategy applicable for many neuromuscular disorders.

      Muscle cell engraftment is quite challenging as the capacity of these cells to populate different muscles is very poor. Improving the cell migration, survival and proliferation may thus help to improve the muscle cell engraftment strategy.

      Expertise:

      I have expertise in neuromuscular disorders, muscle stem cells (human and murine, in vitro and in vivo), and omics analysis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Point-by-point response to reviewer comments


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In the current manuscript, Millarte et al. report a novel role of Rabaptin5 in selectively clearing damaged endosomes via canonical autophagy. They have identified FIP200 as a novel interactor of Rabaptin5 under basal conditions using yeast two-hybrid screening and further confirmed the interaction of Rabaptin5 with FIP200 by immunoprecipitation. They next used Chloroquine and monitored colocalization of Rabaptin5 with WIPI2, ATG16L1 and LC3B to demonstrate the potential interaction of Rabaptin5 with the autophagic machinery. They have primarily used Gal-3 as a marker of membrane damage after 30 minutes of Chloroquine treatment. In order to further elucidate the role of Rabaptin5 in autophagic induction mediated by Chloroquine, they have silenced Rabaptin5, FIP200, ULK1 and ATG13 and observed a decrease in the number of LC3- or WIPI2-positive autophagosomes formed. Based on these observations, they tested whether Rabaptin5 interacts with ATG16L1 upon Chloroquine treatment and confirmed their interaction, and the potential interaction sites of Rabaptin5 and ATG16L1, by IP. The authors confirmed the interaction of Rabaptin5 with ATG16L1 by complementing the KO line with the mutant form of Rabaptin5 containing alanine residues in its consensus motif. Finally, they have used Salmonella and SCV as a model to study the role of Rabaptin5 in endomembrane damage and observed a 50% decrease in the removal of Salmonella in Rabaptin5 KO or KD cells.

      Major concerns

      One of the major concerns is the membrane damage attributed to chloroquine, which is known to induce lysosomal swelling and further targeting of the swollen compartments to degradation by direct conjugation of LC3 onto single membranes as a form of non-canonical autophagy. The evidence regarding membrane damage from Gal3 colocalization on the Rabaptin5 vesicles is preliminary. As suggested by the authors, the canonical autophagy pathway recognizing damaged membranes also recruits ALIX to the damaged membrane, which was not observed in Supplementary Figure 2. The link between membrane damage by chloroquine and monensin and Rabaptin5 is not convincing, as there is not sufficient evidence of membrane damage. In relation to this issue, the authors should consider using other damage markers such as Gal8, p62 or NDP52 to provide additional support for the membrane damage induced by chloroquine.

      To expand on the question of CQ treatment damaging early endosomes, we also tested for Gal8 on Rabaptin5-positive enlarged endosomes and quantified the fraction of Rabaptin5-positive rings positive for Gal3 and Gal8 after 30 min of CQ treatment. We propose to include this data in Figure 2:

      We have tested the importance of Gal3 and p62 by siRNA-mediated knockdown where we found a robust inhibition of induction of WIPI2 puncta with CQ, but not with Torin1. Formation of LC3 puncta was less reduced, similar to knockdowns of FIP200, ATG13, or Rabaptin5.

      We propose to add these knockdown experiments as a supplementary figure:

      One of the main claims here is that Rabaptin5 regulates the targeting of damaged endosomes to autophagy. Clearly, these are early endosomes as stated in the abstract. However, the evidence presented here showing that these are early endosomes is not convincing. Analysing Gal3- and Gal8-positive vesicles that are positive for Rabaptin5 and an early endosomal marker will be important in this context. For example, there needs to be additional evidence showing that early endosomes are targeted to autophagy. Is the degradation of TfR affected by this targeting? Did the authors look at the effect of Bafilomycin A1? If this process affects exclusively early endosomes, it should be BafA1 independent. This will point more towards the cellular function of this process.

      Rabaptin5 is a bona fide marker of Rab5-positive early sorting endosomes. As a control, we confirmed colocalization of Rabaptin5 with transferrin receptor, another endosomal marker, on CQ-induced rings (Fig. 2B). We now also analyzed swollen endosomes with triple-staining for Rabaptin5, transferrin receptor, and Gal3 as shown in this gallery (30 min CQ, as in Fig. 2). All Rabaptin5-positive swollen endosomes (rings) were positive for transferrin receptor and ~80% for mCherry-Gal3.

      We further tested transferrin receptor levels with and without CQ. Since CQ inhibits autophagic flux, this assay may not be very sensitive. Nevertheless, we found a significant reduction of ~15% and ~30% after overnight incubation with CQ in parental HEK293A cells and in Rbpt5-KO cells re-expressing wild-type Rabaptin5, respectively, but no reduction in Rbpt5-KO cells expressing the Rabaptin5-AAA mutant defective in binding to ATG16L1:

      As to the effect of BafA1, see our general response on top. The osmotic effect of CQ or Mon on endosomes that leads to membrane breakage requires an acidic pH. Preincubation with BafA1 neutralizes the pH, prevents osmotic swelling by CQ/Mon, and was shown to block LC3 lipidation (Florey et al., 2015, Jacquin et al., 2017). When BafA1 was added simultaneously, CQ was found to induce LC3 despite the presence of BafA1 (Mauthe et al., 2018), and Mon was shown to still be able to break endosomal membranes and recruit LC3 to EEA1-positive endosomes (Fraser et al., 2019). However, CQ-induced LC3 recruitment to latex bead-containing phagosomes or entotic vacuoles, i.e. LAP-like autophagy, was blocked (Florey et al., 2015). Consistent with this literature, we found increased LC3B lipidation already within 30 min of CQ treatment independently of BafA1 (no preincubation).

      Upon longer incubations, LC3B lipidation is very strong already with BafA1 alone so that the effect of CQ cannot be assessed anymore, since both drugs inhibit autophagic flux.

      Furthermore, we found a CQ-dependent increase in WIPI2- and LC3B-positive puncta to be insensitive to BafA1 (panel A below). Colocalization of Rabaptin5 to LC3B and LC3B to Rabaptin5 significantly increased upon CQ treatment independently of the presence of BafA1 (no pretreatment), indicating that at least a large part of CQ-induced LC3B puncta is not due to LAP-like autophagy.

      Minor concerns

      For both Figure 2 and Supplementary Figure 7, it will be clearer to have the images in colour rather than black and white for better interpretation.

      We thought the grayscale images were clearer, but are happy to provide color images.

      The interaction of FIP200 and ATG16L1 with Rabaptin5 is well characterized with immunoprecipitation and imaging, but the interactions of Rabaptin5 with FIP200 and ATG16L1 DWD in the presence of chloroquine are missing, and it will be important to show whether these interactions increase in the presence of chloroquine.

      We can do co-IP experiments also upon CQ treatment.

      In order to further support the role of Rabaptin5 in LC3 lipidation upon chloroquine-induced membrane damage, western blots of WT, +Rabaptin5, Rabaptin5 KO, Rabaptin5 KO +WT or +AAA cell lines were analysed. However, the lysates were collected after 30 minutes of chloroquine treatment, which does not correlate with the imaging performed in Figure 2, as the number of LC3 vesicles did not show an increase after 30 minutes of chloroquine treatment. The authors should include the 150-minute time point for LC3 lipidation in these conditions.

      Because CQ inhibits autophagic flux, LC3-II accumulates after longer times in all cell lines. The differences can only be seen early.

      The experiments with Salmonella are of great quality. The relationship of Rabaptin5 with SCV and the endomembrane damage induced by Salmonella could be further elucidated with Rabaptin5 positive vesicles at early infection stages. It is not very clear from the text how the authors link the endosomal network previously described for chloroquine with infection. It would be important here to show that Salmonella mutants unable to damage endosomal membranes do not have an effect. In Figure 7 panel C, the time points on the graphs are in hours but should be in minutes. (Corrected.)

      Since Salmonella require T3SS for infection of HEK cells and T3SS causes the membrane damage, the proposed experiment is difficult.

      The targeting of damaged membranes for degradation has been well characterized: these membranes are recognized by Gal3 and Gal8, and autophagic receptors are recruited to the site of damage (Chauhan et al. 2016; Jia et al. 2019; Thurston et al. 2012; Maejima et al. 2013; Kreibich et al. 2015). This manuscript introduces a new potential platform for the assembly of the autophagic machinery on endosomes through the interaction of Rabaptin5 with FIP200 and ATG16L1; however, more evidence is required to link this to the clearance of damaged membranes. It was previously shown that endolysosomal compartments neutralized and swollen by monensin and chloroquine are directed to degradation by direct conjugation of LC3 to single membranes via noncanonical autophagy, but here the authors propose another mechanism for this event via canonical autophagy.

      As discussed in the general response above, the literature reports that CQ and Mon initiate both canonical autophagy and LAP-like autophagy, the latter particularly on phagosomes containing latex beads or on entotic vacuoles. Our results, including the additional data above, concern the effects of CQ and Mon in damaging early endosomes and causing recruitment of galectins and ubiquitination, which triggers autophagy that depends on the ULK complex and WIPI2 (hallmarks of canonical autophagy) and on Rabaptin5. The reviewer comments highlighted the possibility of LAP-like autophagy occurring in parallel, perhaps on endosomes that are not broken, which might explain why LC3 puncta induced by CQ and Mon are relatively insensitive to knockdown of FIP200, ATG13, or Rabaptin5, in contrast to the strong and robust reduction of WIPI2 puncta. In an alternative explanation, inhibition of autophagic flux causes LC3-positive structures from the remaining canonical autophagy to accumulate, while WIPI2 puncta are strongly inhibited. In support of the latter interpretation, ULK inhibition by MRT68921 (Fig. 4C and D) or FIP200 knockout (Fig. 6B and C) abolished CQ-induced LC3 structures, suggesting that, unlike on phagosomes or entotic vacuoles, there is little LAP-like autophagy. We propose to revise the manuscript to discuss these considerations more clearly.

      Reviewer #1 (Significance (Required)):

      Overall this work is very novel and shows some evidence of early endosomal autophagy. It could be relevant for some form of receptor-mediated signalling (although this is not discussed by the authors). My experience is in intracellular trafficking of pathogens and membrane damage.

      **Referee Cross-commenting**

      In my opinion, the only way you can distinguish between double or single membrane is by EM. For me, the important part is to show that this is targeting of early endosomes to autophagy, either by using other early endosomal markers, by analysing by WB some early endosome receptors such as TfR, or by using other inhibitors. If the authors are able to address some of these comments, I agree the paper will be in a better position for publication.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Millarte et al study the role of Rabaptin-5 (Rbpt5) during early endosome damage recognition by autophagy. The authors focus on using chloroquine (CQ) as an agent to induce endosomal swelling/damage and suggest that Rbpt5 is required for the recruitment of the autophagy machinery to perturbed endosomes. They further use Salmonella infection as a model to confirm the role of Rbpt5 in this process. The authors initially show that Rbpt5 binds to FIP200 and subsequently focus on its interaction with ATG16L1 and identify a mutant that is unable to bind ATG16L1 or mediate the recognition of early endosomes by autophagy. Overall, this is an interesting study which provides molecular insights into how early endosomes can be targeted by the autophagy machinery, where Rbpt5 may act as an autophagy receptor. Some specific comments are as follows:

      Fig.3A: siRbpt5 seems to induce the localization of LC3 to ring-like structures during CQ treatment. Are these LAP-like structures (e.g. sensitive to BafA1 treatment)? And were they included in the quantification in Fig.3C?

      Ring-like LC3 structures were also counted.

      As discussed in the general remarks above, it is a possibility that knockdown-resistant LC3 recruitment (particularly rings) is due to a CQ-induced LAP-like process. The alternative explanation is that there is residual canonical autophagy upon knockdown of Rabaptin5, ATG13, or FIP200: while WIPI2 puncta are strongly reduced, LC3-positive structures accumulate due to inhibition of autophagic flux. In support of the latter interpretation, ULK inhibition by MRT68921 (Fig. 4C and D) or FIP200 knockout (Fig. 6B and C) abolished CQ-induced LC3 puncta or rings.

      We can also test BafA1 treatment. Certainly, we will revise the text to discuss this point in more detail.

      Fig.4A&B: Since Rbpt5 KD has a weak effect on LC3 puncta formation (Fig.3) and to distinguish the effects of CQ in inducing LAP, the effects of ATG13 and ULK1 KD should be assessed by localising Rbpt5 with WIPI2 or ATG16L1.

      We can do that.

      Fig.4: It is not clear why ULK1 KD would affect Torin1-induced autophagy but not LC3/WIPI2 localisation during CQ induced early endosome-damage. As the ULK inhibitors can target other pathways, the authors should confirm this finding in ULK1/2 double KO or KD cells.

      We have used MRT68921 because it is frequently used in the literature for this purpose with high specificity. It was used, for example, by Lystad et al. (2019) together with VPS34IN1 to block all canonical autophagy in order to analyze exclusively noncanonical effects of monensin treatment. We could perform ULK1/2 double knockdowns, but since ULK2 cannot be detected on immunoblots in HEK293 cells, the result would be interpretable only when there is an effect.

      Fig.5: The contribution of FIP200 in the interaction between Rbpt5 and ATG16L1 is unclear. Is binding between Rbpt5 and ATG16L1 mediated by ATG16L1's interaction with FIP200? The plasmid details describing the delta-WD40 deletion plasmid used in this study are missing and could be important to confirm that the delta-WD40 still retains binding to FIP200.

      We will of course include the details on the deletion plasmid, which were missing by mistake. Our WD deletion construct of ATG16L1 consists of residues 1–319, precisely deleting just the WD40 repeats, but retaining the FIP200 interaction sequence and the second membrane binding segment (b).

      We did a co-immunoprecipitation experiment and found both wild-type ATG16L1 and the ∆WD mutant to co-immunoprecipitate with FIP200.

      Fig.5E: the authors should test Rbpt5 AAA mutant binding to FIP200. Since the mutant appears to express less, its binding to ATG16L1 should be quantified or repeated with more comparable expression levels.

      We will quantify the immunoblots and perhaps attempt getting more equal expression levels.

      Fig.6: CQ treatment can induce various endosomal damage (in addition to early endosomes) and LC3 lipidation processes (e.g. LAP-like). The authors show that Rbpt5 is specifically involved in damaged early endosome autophagy. In this figure, it would be important to distinguish CQ-induced LC3 puncta as a result of early endosome damage or other lipidation processes (e.g. canonical or non-canonical autophagy). The use of FIP200 KO cells shows that LC3 puncta is inhibited. However, here a specific readout to look at early endosome recognition by autophagy is important. The authors can localize early endosome markers (EEA1) with autophagy players (e.g. WIPI2 and LC3). This is also relevant to other figures (e.g. supplementary figure 7E).

      Rabaptin5 is a bona fide marker of Rab5-positive early sorting endosomes. As a control, we confirmed colocalization of Rabaptin5 with transferrin receptor, another endosomal marker, on CQ-induced rings (Fig. 2B). We also analyzed swollen endosomes with triple-staining for Rabaptin5/ transferrin receptor/ Gal3 as shown in this gallery (30 min CQ, as in Fig. 2). All Rabaptin5-positive swollen endosomes (rings) were positive for transferrin receptor and ~80% for mCherry-Gal3.

      Our results are in agreement with Fraser et al. (2019) where they use EEA1 as an endosomal marker upon monensin treatment.

      We also performed a colocalization analysis for Rabaptin5 and LC3B, showing enhanced colocalization after CQ treatment for 150 min: ~20% of LC3B is (still) positive for Rabaptin5 after 150 min of CQ treatment.

      Fig.6F&G: the authors should show representative images of these localization images quantified here. These can be added in the supplementary figures.

      We are happy to do this.

      **Minor comments:**

      Fig.2E: FIP200 seems to be highly overexpressed in this image. Commercial antibodies that recognise endogenous FIP200 are widely used and should be tested to confirm the colocalisation between FIP200 and Rbpt5.

      We plan to do this.

      Fig.7C image: the different setting denoted by +/-, +/+ ..etc are not clearly defined.

      We will improve this.

      Reviewer #2 (Significance (Required)):

      This is an interesting study and provides important mechanistic insights underlying the recognition of perturbed early endosomes by the autophagy machinery. Researchers interested in endosomal trafficking or autophagic substrate recognition are likely to benefit from this study.

      **Referee Cross-commenting**

      In my opinion, the authors have attempted to distinguish single membrane from double membrane LC3 lipidation by looking at the ULK complex requirement. As other reviewers suggested, this can be further confirmed by using ATG16L1 mutants. It is important however that these experiments are supplemented by co-localising autophagy proteins with alternative early endosome markers when Rbpt5 is inhibited.

      I think if the authors are able to address the suggested experiments, this would help improve the manuscript and make it suitable for publication.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Millarte and colleagues find that Rabaptin5, a critical regulator of Rab5 activity and a protein localized to early endosomes, interacts with FIP200 and ATG16L1. This interaction is confirmed and validated by a number of approaches (yeast two-hybrid, co-immunoprecipitation) and the binding sites of Rabaptin5 are mapped on FIP200 and ATG16L1. More precisely, the binding site for ATG16L1 is nicely mapped on Rabaptin5 by analogy with other ATG16L1 binders. The authors investigate the significance of this binding of Rabaptin5 to the autophagy proteins by proposing that this interaction is required for targeting "autophagy to damaged endosomes". Endosomes are damaged with short treatments of chloroquine, a well studied compound previously shown to inhibit autophagy by disrupting fusion of autophagosomes with lysosomes. They propose that the recruitment of autophagy (proteins) to the damaged endosomes may allow them to be eliminated. They use another model (phagocytosis of Salmonella) to probe the role of Rabaptin5 and its partners FIP200 and ATG16L1 in the well-documented role of autophagy in the elimination of Salmonella in SCVs (Salmonella-containing vacuoles) formed from endosomes. Using short infection time points and the Rabaptin5 mutants which prevent ATG16L1 binding, they suggest that Rabaptin5 binding contributes to elimination and killing of Salmonella by recruitment of ATG16L1.

      **Major comments:**

      1. The authors make an unfortunate and confusing choice of wording in the title and the text of "autophagy being recruited" to damaged early endosomes. A protein can recruit another protein but it cannot recruit a process or pathway to a membrane.

      In the title we use the term "target". It is OK for us to avoid the expression "recruiting autophagy".

      The authors conclude that Rabaptin5 may have a role in autophagy directed to damaged early endosomes. The conclusion that Rabaptin5 binds FIP200 and ATG16L1 is convincing. The main issue, however, is in identifying what sort of process they are following. They have shown that WIPI2 and LC3 can be recruited to early endosomes after 30 min chloroquine treatment, but there are no data to explain the consequences of the binding of these proteins. They do not provide proof that canonical autophagosomes are formed which engulf and remove the damaged endosomes, nor do they show that the recruitment of WIPI2 is to a single membrane (presumably damaged early endosomes), which would be a non-canonical pathway. They often use the terminology "chloroquine-induced autophagy" (see Figure 4) but have virtually no proof they have induced either canonical or non-canonical pathways in their experiments. The only evidence they provide that there is some alteration in a membrane-mediated event is the increase in lipidation of LC3 in Figure 6. The authors must follow either an early endosome protein or cargo to demonstrate lysosome-mediated degradation indicative of autophagy, or demonstrate that the process is a variation on non-canonical autophagy.

      We analyzed transferrin receptor levels with and without CQ to test degradation of an early endosomal marker protein. Since CQ inhibits autophagic flux, this assay may not be very sensitive. Nevertheless, we found a significant reduction of ~15% and ~30% after overnight incubation with CQ in parental HEK293 cells and in Rbpt5-KO cells re-expressing wild-type Rabaptin5, respectively, but no reduction in Rbpt5-KO cells expressing the Rabaptin5-AAA mutant defective in binding to ATG16L1.

      There are concerns about the replicates done for many experiments in particular the co-immunoprecipitations which are not quantified (Figure 1 and 5).

      We will quantify these blots.

      The rescue experiments, even if done with stable cell lines made in the parental HEK293 cell line, should be viewed with caution because of the very different amounts of Rabaptin5 (see Figure 6A). The overexpression of Rabaptin5 has not been well studied and comparisons with the mutants are therefore preliminary (Figure 6F and G).

      Fig 6A shows that Rabaptin5 levels are similar except for +Rbpt, where they are higher, and R-KO, which has none. Additional Rabaptin5 seems not to significantly enhance early WIPI and ATG16L1 colocalization.

      Conclusions about the role of the ULK complex, or ULK1 versus ULK2, should be expanded by studying the activity of the complex (phosphorylation of ATG13 for example) in order to make the conclusions more significant.

      We consider this to be beyond the scope of this study. Rabaptin5-dependent autophagy depends on the components of the ULK complex.

      **Minor comments:**

      1. Much of the labelling in the immunofluorescence images is not visible even on the screen version.

      We were careful to have the signals within the dynamic range of the image, but we can enhance the signals for better visibility.

      The LC3-lipidation experiment (Figure 6D) should be re-analysed by normalization to the loading control. The result may be significantly different and is open to re-interpretation. The quality of this western blot is also very poor.

      Quantitation was based on the ratio between LC3B-I and -II or the percentage of II of the total, always within the same lane and therefore largely independent of loading.
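      To illustrate why a within-lane ratio is largely insensitive to loading, here is a minimal Python sketch. The function name and the band intensities are hypothetical and are not taken from the manuscript; the sketch also assumes that both bands lie within the linear range of detection, which is why the independence is only approximate in practice.

```python
def lc3_ii_fraction(band_i: float, band_ii: float) -> float:
    """Return LC3B-II as a fraction of total LC3B (I + II) in one lane."""
    total = band_i + band_ii
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return band_ii / total


# Hypothetical densitometry values (arbitrary units) for a single lane.
band_i, band_ii = 120.0, 80.0

# Simulate loading half, the same, or double the amount of lysate:
# both bands scale by the same factor, so the fraction does not change.
for loading in (0.5, 1.0, 2.0):
    frac = lc3_ii_fraction(band_i * loading, band_ii * loading)
    print(f"loading x{loading}: LC3B-II = {frac:.1%} of total")
```

      Because the loading factor cancels in the ratio, normalization to a separate loading control is not needed for this readout, as long as neither band is saturated.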

      Reviewer #3 (Significance (Required)):

      This manuscript topic fits into the field of study of canonical versus non-canonical autophagy. This literature is best described as "LAP", first discovered by Doug Green (Sanjuan in 2009), but more recently as a phenomenon induced by monensin and viral infection, amongst others. Most relevant to this study are the references (in the text) by Florey (Autophagy 2015), Fletcher (EMBO J, 2018) and others. However, this manuscript fails to cite and consider the critical findings in a key study published by Lystad et al., Nature Cell Biology 2019, which examines the role of ATG16 in both canonical and non-canonical autophagy. The current study, if placed into the context of the Lystad study, would have significantly more value, and potentially make the findings more significant.

      We did not refer to Lystad et al. (2019), because they analyzed different ATG16L1 mutants on their contribution to monensin-induced processes on LC3 lipidation after completely blocking canonical autophagy with the ULK inhibitor MRT68921 and the VPS34 inhibitor VPS34IN1. The Rabaptin5-dependent CQ-induced processes are blocked by MRT68921 (Fig. 4C). We plan to refer to this study in the revision.

      Furthermore, the short chloroquine treatments used here could be of interest to the field if, using the cited study of Mauthe et al. (which very clearly defines the effect of chloroquine after a long, 5 h treatment), the authors were to revisit and repeat some of the key experiments in order to demonstrate the effects of a 30 minute treatment. Does such a short treatment block fusion? Does it affect the pH of the acidic compartments? Does it inactivate the endocytic pathway? As the manuscript stands, the lack of understanding of the effect of chloroquine at short times makes the observations difficult to place into any biological context.

      This reviewer has expertise in autophagy, autophagosome formation and is familiar with the areas of endocytosis and infection.

      **Referee Cross-commenting**

      I think a major concern about the manuscript, which is present in all reviews, is the lack of clarity about what type of membrane LC3 is added to: is this the damaged endosome or a forming autophagosome? This leads to the question of what type of process is being observed here: non-canonical versus canonical autophagy.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Millarte and colleagues find that Rabaptin5, a critical regulator of Rab5 activity and a protein localized to early endosomes, interacts with FIP200 and ATG16L1. This interaction is confirmed and validated by a number of approaches (yeast two-hybrid, co-immunoprecipitation) and the binding sites of Rabaptin5 are mapped on FIP200 and ATG16L1. More precisely, the binding site for ATG16L1 is nicely mapped on Rabaptin5 by analogy with other ATG16L1 binders. The authors investigate the significance of this binding of Rabaptin5 to the autophagy proteins by proposing that this interaction is required for targeting "autophagy to damaged endosomes". Endosomes are damaged with short treatments of chloroquine, a well studied compound previously shown to inhibit autophagy by disrupting fusion of autophagosomes with lysosomes. They propose that the recruitment of autophagy (proteins) to the damaged endosomes may allow them to be eliminated. They use another model (phagocytosis of Salmonella) to probe the role of Rabaptin5 and its partners FIP200 and ATG16L1 in the well-documented role of autophagy in the elimination of Salmonella in SCVs (Salmonella-containing vacuoles) formed from endosomes. Using short infection time points and the Rabaptin5 mutants which prevent ATG16L1 binding, they suggest that Rabaptin5 binding contributes to elimination and killing of Salmonella by recruitment of ATG16L1.

      Major comments:

      1. The authors make an unfortunate and confusing choice of wording in the title and the text of "autophagy being recruited" to damaged early endosomes. A protein can recruit another protein but it cannot recruit a process or pathway to a membrane.
      2. The authors conclude that Rabaptin5 may have a role in autophagy directed to damaged early endosomes. The conclusion that Rabaptin5 binds FIP200 and ATG16L1 is convincing. The main issue, however, is in identifying what sort of process they are following. They have shown that WIPI2 and LC3 can be recruited to early endosomes after 30 min chloroquine treatment, but there are no data to explain the consequences of the binding of these proteins. They do not provide proof that canonical autophagosomes are formed which engulf and remove the damaged endosomes, nor do they show that the recruitment of WIPI2 is to a single membrane (presumably damaged early endosomes), which would be a non-canonical pathway. They often use the terminology "chloroquine-induced autophagy" (see Figure 4) but have virtually no proof they have induced either canonical or non-canonical pathways in their experiments. The only evidence they provide that there is some alteration in a membrane-mediated event is the increase in lipidation of LC3 in Figure 6. The authors must follow either an early endosome protein or cargo to demonstrate lysosome-mediated degradation indicative of autophagy, or demonstrate that the process is a variation on non-canonical autophagy.
      3. There are concerns about the replicates done for many experiments in particular the co-immunoprecipitations which are not quantified (Figure 1 and 5).
      4. The rescue experiments, even if done with stable cell lines made in the parental HEK293 cell line, should be viewed with caution because of the very different amounts of Rabaptin5 (see Figure 6A). The overexpression of Rabaptin5 has not been well studied and comparisons with the mutants are therefore preliminary (Figure 6F and G).
      5. Conclusions about the role of the ULK complex, or ULK1 versus ULK2, should be expanded by studying the activity of the complex (phosphorylation of ATG13 for example) in order to make the conclusions more significant.

      Minor comments:

      1. Much of the labelling in the immunofluorescence images is not visible even on the screen version.
      2. The LC3-lipidation experiment (Figure 6D) should be re-analysed by normalization to the loading control. The result may be significantly different and is open to re-interpretation. The quality of this western blot is also very poor.

      Significance

      This manuscript topic fits into the field of study of canonical versus non-canonical autophagy. This literature is best described as "LAP", first discovered by Doug Green (Sanjuan in 2009), but more recently as a phenomenon induced by monensin and viral infection, amongst others. Most relevant to this study are the references (in the text) by Florey (Autophagy 2015), Fletcher (EMBO J, 2018) and others. However, this manuscript fails to cite and consider the critical findings in a key study published by Lystad et al., Nature Cell Biology 2019, which examines the role of ATG16 in both canonical and non-canonical autophagy. The current study, if placed into the context of the Lystad study, would have significantly more value, and potentially make the findings more significant.

      Furthermore, the short chloroquine treatments used here could be of interest to the field if, using the cited study of Mauthe et al. (which very clearly defines the effect of chloroquine after a long, 5 h treatment), the authors were to revisit and repeat some of the key experiments in order to demonstrate the effects of a 30 minute treatment. Does such a short treatment block fusion? Does it affect the pH of the acidic compartments? Does it inactivate the endocytic pathway? As the manuscript stands, the lack of understanding of the effect of chloroquine at short times makes the observations difficult to place into any biological context.

      This reviewer has expertise in autophagy, autophagosome formation and is familiar with the areas of endocytosis and infection.

      Referee Cross-commenting

      I think a major concern about the manuscript, which is present in all reviews, is the lack of clarity about what type of membrane LC3 is added to: is this the damaged endosome or a forming autophagosome? This leads to the question of what type of process is being observed here: non-canonical versus canonical autophagy.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Millarte et al study the role of Rabaptin-5 (Rbpt5) during early endosome damage recognition by autophagy. The authors focus on using chloroquine (CQ) as an agent to induce endosomal swelling/damage and suggest that Rbpt5 is required for the recruitment of the autophagy machinery to perturbed endosomes. They further use Salmonella infection as a model to confirm the role of Rbpt5 in this process. The authors initially show that Rbpt5 binds to FIP200 and subsequently focus on its interaction with ATG16L1 and identify a mutant that is unable to bind ATG16L1 or mediate the recognition of early endosomes by autophagy. Overall, this is an interesting study which provides molecular insights into how early endosomes can be targeted by the autophagy machinery, where Rbpt5 may act as an autophagy receptor. Some specific comments are as follows:

      Fig.3A: siRbpt5 seems to induce the localization of LC3 to ring-like structures during CQ treatment. Are these LAP-like structures (e.g. sensitive to BafA1 treatment)? And were they included in the quantification in Fig.3C?

      Fig.4A&B: Since Rbpt5 KD has a weak effect on LC3 puncta formation (Fig.3) and to distinguish the effects of CQ in inducing LAP, the effects of ATG13 and ULK1 KD should be assessed by localising Rbpt5 with WIPI2 or ATG16L1.

      Fig.4: It is not clear why ULK1 KD would affect Torin1-induced autophagy but not LC3/WIPI2 localisation during CQ induced early endosome-damage. As the ULK inhibitors can target other pathways, the authors should confirm this finding in ULK1/2 double KO or KD cells.

      Fig.5: The contribution of FIP200 in the interaction between Rbpt5 and ATG16L1 is unclear. Is binding between Rbpt5 and ATG16L1 mediated by ATG16L1's interaction with FIP200? The plasmid details describing the delta-WD40 deletion plasmid used in this study are missing and could be important to confirm that the delta-WD40 still retains binding to FIP200.

      Fig.5E: the authors should test Rbpt5 AAA mutant binding to FIP200. Since the mutant appears to express less, its binding to ATG16L1 should be quantified or repeated with more comparable expression levels.

      Fig.6: CQ treatment can induce various endosomal damage (in addition to early endosomes) and LC3 lipidation processes (e.g. LAP-like). The authors show that Rbpt5 is specifically involved in damaged early endosome autophagy. In this figure, it would be important to distinguish CQ-induced LC3 puncta as a result of early endosome damage or other lipidation processes (e.g. canonical or non-canonical autophagy). The use of FIP200 KO cells shows that LC3 puncta is inhibited. However, here a specific readout to look at early endosome recognition by autophagy is important. The authors can localize early endosome markers (EEA1) with autophagy players (e.g. WIPI2 and LC3). This is also relevant to other figures (e.g. supplementary figure 7E).

      Fig.6F&G: the authors should show representative images of these localization images quantified here. These can be added in the supplementary figures.

      Minor comments:

      Fig.2E: FIP200 seems to be highly overexpressed in this image. Commercial antibodies that recognise endogenous FIP200 are widely used and should be tested to confirm the colocalisation between FIP200 and Rbpt5.

      Fig.7C image: the different setting denoted by +/-, +/+ ..etc are not clearly defined.

      Significance

      This is an interesting study and provides important mechanistic insights underlying the recognition of perturbed early endosomes by the autophagy machinery. Researchers interested in endosomal trafficking or autophagic substrate recognition are likely to benefit from this study.

      Referee Cross-commenting

      In my opinion, the authors have attempted to distinguish single membrane from double membrane LC3 lipidation by looking at the ULK complex requirement. As other reviewers suggested, this can be further confirmed by using ATG16L1 mutants. It is important however that these experiments are supplemented by co-localising autophagy proteins with alternative early endosome markers when Rbpt5 is inhibited.

      I think if the authors are able to address the suggested experiments, this would help improve the manuscript and make it suitable for publication.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In the current manuscript, Millarte et al report a novel role of Rabaptin5 in selectively clearing damaged endosomes via canonical autophagy. They have identified FIP200 as a novel interactor of Rabaptin5 under basal conditions using yeast two-hybrid screening and further confirmed the interaction of Rabaptin5 with FIP200 by immunoprecipitation. They next used chloroquine and monitored colocalization of Rabaptin5 with WIPI2, ATG16L1 and LC3B to demonstrate the potential interaction of Rabaptin5 with the autophagic machinery. They have primarily used Gal-3 as a marker of membrane damage after 30 minutes of chloroquine treatment. In order to further elucidate the role of Rabaptin5 in autophagic induction mediated by chloroquine, they have silenced Rabaptin5, FIP200, ULK1 and ATG13 and observed a decrease in the number of LC3- or WIPI2-positive autophagosomes. Based on these observations, they tested whether Rabaptin5 interacts with ATG16L1 upon chloroquine treatment and confirmed their interaction, mapping the potential interaction sites of Rabaptin5 and ATG16L1 by IP. The authors confirmed the interaction of Rabaptin5 with ATG16L1 by complementing the KO line with the mutant form of Rabaptin5 containing alanine residues in its consensus motif. Finally, they have used Salmonella and SCV as a model to study the role of Rabaptin5 in endomembrane damage and observed a 50% decrease in the removal of Salmonella in Rabaptin5 KO or KD cells.

      Major concerns

      One of the major concerns is the membrane damage reported for chloroquine, which is known to induce lysosomal swelling and further targeting of the swollen compartments to degradation by direct conjugation of LC3 onto single membranes as a form of non-canonical autophagy. The evidence regarding membrane damage by Gal3 colocalization on the Rabaptin5 vesicles is preliminary. As suggested by the authors, the canonical autophagy pathway recognizing damaged membranes also recruits ALIX to the damaged membrane, which was not observed in Supplementary Figure 2. The link between membrane damage by chloroquine and monensin and Rabaptin5 is not convincing, as there is not sufficient evidence of membrane damage. In relation to this issue, the authors should consider using other damage markers such as Gal8, p62 or NDP52 to provide additional support for membrane damage induced by chloroquine.

      One of the main claims here is that Rabaptin5 regulates the targeting of damaged endosomes to autophagy. Clearly, these are early endosomes as stated in the abstract. However, the evidence presented here showing these are early endosomes is not convincing. Analysing Gal3 and Gal8 positive vesicles that are positive for Rabaptin5 and an early endosomal marker will be important in this context. For example, there needs to be additional evidence showing that early endosomes are targeted to autophagy. Is the degradation of TfR affected by this targeting? Did the authors look at the effect of Bafilomycin A1? If this process affects exclusively early endosomes, it should be BafA1 independent. This will point more directly to the cellular function of this process.

      Minor concerns

      For both Figure 2 and Supplementary Figure 7 it will be clearer to have the images in colour rather than black and white for better interpretation.

      The interaction of FIP200 and ATG16L1 with Rabaptin5 is well characterized with immunoprecipitation and imaging, but the interaction of Rabaptin5 with FIP200 and the ATG16L1 WD domain in the presence of chloroquine is missing, and it will be important to show whether these interactions increase in the presence of chloroquine.

      In order to further support the role of Rabaptin5 for LC3 lipidation upon chloroquine induced membrane damage, western blots of WT, +Rabaptin5, Rabaptin5 KO, Rabaptin5 KO +WT or +AAA cell lines were analysed. However, the lysates were collected after 30 minutes of chloroquine treatment, which does not correlate with the imaging performed in Figure 2, as the number of LC3 vesicles did not show an increase after 30 minutes of chloroquine treatment. The authors should include the 150 minutes time point for the LC3 lipidation in these conditions.

      The experiments with Salmonella are of great quality. The relationship of Rabaptin5 with SCV and the endomembrane damage induced by Salmonella could be further elucidated with Rabaptin5 positive vesicles at early infection stages. It is not very clear from the text how authors link the endosomal network previously described for chloroquine with infection. It would be important here to show that Salmonella mutants unable to damage endosomal membranes do not have an effect. In Figure 7 panel C, the time points on graphs are in hours but it should be in minutes.

      The targeting of damaged membranes for degradation has been well characterized through the recognition of these membranes by Gal3 and Gal8 and the recruitment of autophagic receptors to the site of damage (Chauhan et al. 2016; Jia et al. 2019; Thurston et al. 2012; Maejima et al. 2013; Kreibich et al. 2015). This manuscript introduces a new potential platform for the formation of autophagic machinery on endosomes through the interaction of Rabaptin5 with FIP200 and ATG16L1; however, more evidence is required to link this to the clearance of damaged membranes. Previously it was shown that endolysosomal compartments that were neutralized and swollen by monensin and chloroquine were directed to degradation by direct conjugation of LC3 to single membranes via noncanonical autophagy, but here the authors propose another mechanism for this event via canonical autophagy.

      Significance

      Overall this work is very novel and shows some evidence of early endosomal autophagy. It could be relevant for some form of receptor-mediated signalling (although it is not discussed by the authors). My experience is in intracellular trafficking of pathogens and membrane damage.

      Referee Cross-commenting

      In my opinion, the only way you can distinguish between double or single membrane is by EM. For me, the important part is to show this is targeting of early endosomes to autophagy, either using other early endosomal markers, analysing by WB some early endosome receptors such as TfR or other inhibitors. If the authors are able to address some of these comments, I agree the paper will be in a better position for publication.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We would like to thank our reviewers for their critical reading and constructive comments. We have addressed all of their points and have included below a detailed reference to the changes we made accordingly. We have also added an additional supplemental figure.

      Reviewer #1 :

      **Major comments:**

      1. The authors highlight in their conclusion that the new Python library has the potential to accelerate and expand microscopy development. I agree with this statement since classes and methods do not need to be written in Python from scratch anymore. However, I would recommend that the authors include in their conclusion the value of the library for reproducibility if the final python acquisition code is shared along with publications. Nowadays, scientists frequently write in their publications that LabView or a specific commercial scope's acquisition software was used without any acquisition code. Python-Microscope would have the potential to change this trend, and the authors need to stress this aspect and its value for reproducibility in science accordingly.

      This is a good point. We have added the following to the discussion section.

      “A further advantage of the approach provided by Microscope is in increasing reproducibility in science. Scientists frequently write in their publications that LabView or a specific commercial scope's acquisition software was used without any specific acquisition settings, code or macros to assist with reproduction. This is especially critical in complex experimental setups where specifics of acquisition are particularly important. Microscope has the potential to change this trend, allowing authors to freely publish simple code demonstrating exactly how their control and acquisition operates. Additionally, the defined device interfaces allow such code to be ported to other specific hardware with minimal changes.”

      The authors need to provide a more comprehensive overview of the currently used data acquisition strategies in their introduction. Currently, they highlight the acquisition software provided by vendors for data acquisition (mainly used by life scientists and not necessary scope builders/developers), Micro-Manager (mainly used by life scientists; currently also restricted to wide-field systems), and LabView (for advanced microscope systems; used by advanced developers).

      However, most advanced microscope builders use MatLab (Chmyrov et al. Nature Methods (2013) - https://doi.org/10.1038/nmeth.2556, Ta et al. Nature Communications (2015) - https://doi.org/10.1038/ncomms8977 , etc.), Python (York et al. Nature Methods (2013) - https://doi.org/10.1038/nmeth.2687, etc.), and LabView to write their acquisition software. Since the manuscript focused on advanced microscopes, the authors need to position their library with respect to Matlab and Python's current use as well.

      We thank the reviewer for pointing out the omission of Matlab control solutions and extending the references to other Python based approaches. We have also added a reference to the Pycro-Manager framework for Micro-Manager which has been published since our original submission.

      We have added Matlab to the LabVIEW generalised control software section which now reads:

      “custom control software often in LabVIEW or Matlab, both proprietary software. LabVIEW offers a visual programming environment that is commonly used for building instruments in the physical sciences, whereas Matlab is a programming platform with a focus on numeric computing.”

      And extended the description sections in the introduction with the following paragraphs and references:

      “Matlab is a numerically focused programming environment that scientists are often familiar with for data processing. It has frequently been used for microscopy, leveraging a number of available Matlab sub packages to provide GUIs and easy access to complex data processing steps. The use of Matlab for microscope control is common in the field but the actual code is rarely shared and often custom to a single microscope setup and associated to image reconstruction (Chmyrov et al., 2013, Ta et al., 2015). Exceptions are ScanImage for the control of laser scanning microscopes (Pologruto et al., 2003), and Matlab Instrument Control (MIC) for the control of individual microscope components (Pallikkuth et al., 2018). Matlab provides a textual programming language simplifying code sharing and version control, however, Matlab is proprietary closed source software and the general requirement of many extensions significantly adds to the cost of implementing many systems.”

      “There is currently an increasing number of software options for microscope control in Python, many of which are in the form of custom scripts specific to a microscope (Alvelid and Testa, 2019, York et al., 2013) but some provide fully integrated microscope control environments, namely PYME https://www.python-microscopy.org/ for SMLM and ACQ4 (Campagnola et al., 2014) for electrophysiology. While this code is freely available and can be modified, its design around a specific setup, technique, or environment reduces its potential for code reuse in other projects.”

      The authors need to give (a) software provided by vendors, (b) LabView, and (c) Micro-Manager, more credit.

      (a) Several microscope vendors (e.g., Abberior Instruments - https://imspectordocs.readthedocs.io/en/latest/specpy.html ) allow their scopes to be externally controlled to enable the execution of customer-driven acquisition strategies which the vendor's acquisition software itself might not have implemented. The authors might want to include that scope vendors aim for more customer modifiable acquisition software.

      The reviewer makes a good point, especially in the fact that a number of microscope vendors provide Python interfaces for their systems. We have added the following text:

      Several microscope vendors, such as Abberior Instruments and Zeiss, provide Python interfaces to enable instrument control from Python. These are all very useful additions to proprietary systems, however they have a fundamental drawback: each manufacturer produces their own abstractions, meaning code from one system is not compatible with another. Although these interfaces leverage the substantial Python infrastructure they are not generalisable and hence fail to enhance portability or reproducibility.

      The fact that these companies are providing Python interfaces to their instruments indicates the general interest of the community in Python as a programming language to extend hardware capabilities. This demonstrates the potential benefit of an entirely Python based interface to a wide range of hardware.

      (b) The authors criticize that LabView code can be hard to understand, reproduce and maintain. However, similar to writing good code in general, there are best practice strategies for writing good LabView code to ensure scalability, readability, and maintainability available as well (https://learn.ni.com/learning-paths/labview-core-3-2016-english ). The primary problem might lie more on the side of lousy coding practice than on LabView's side to perform appropriately.

      This is a fair point and we have revised the manuscript as indicated below. However, it remains true that it is much harder for a non-expert to write high quality code in LabView than in Python. This is particularly evident in complex systems.

      We have changed the section about LabView to read:

      “The visual nature of the programming environment makes simple projects easy but systems with a large number of hardware components or complicated control architecture can become hard to understand, reproduce, and maintain. Although this complication can be reduced with good programming practices, it is not uncommon to outsource such work to a commercial company \citep{chhetri2020software} because good code writing in LabView is significantly more challenging than in popular general purpose languages such as Python. Additionally, the LabView work flow does not integrate well into modern distributed source control infrastructure such as mercurial or git, a necessity for modern open source development.”

      (c) The authors should include the current effort by Pinkard et al. (Pinkard et al. Nature Methods (2021) - https://doi.org/10.1038/s41592-021-01087-6 ) in their discussion.

      A pre-print version of this paper was available on arXiv and cited in our original submission. Now that this paper is published, we have included the published reference and added the following text to our discussion section.

      “As mentioned in the introduction, micromanager has a recently introduced Python interface, Pycro-Manager (Pinkard et al. 2021). This simplifies connections between micromanager based hardware interfaces and Python based analysis and control. Although this reduces the effort in using Python for control and online analysis compared to other approaches it does not provide direct access to the hardware via Python. This interface keeps the existing micromanager infrastructure. Particularly new hardware interfaces still need code in both C/C++ and Java before they are accessible via the Python interface.”

      The authors might want to explain how they plan to facilitate the library's adoption and the long-term maintenance within the microscopy community. Do they plan to create a new category on Image.sc, which would allow the community to interact with the developers? etc. Furthermore, who will keep writing wrappers to the libraries provided by the vendors? etc

      This is a critical point. As the reviewer states, community involvement is essential to the continuation of the project and to providing a useful tool going forward. We have already published several systems utilising this software platform and are working hard to expand its user base. We have asked for people to post questions on the image.sc forums (https://image.sc/) and we also interact with developers and users on the github issue pages (https://github.com/python-microscope/microscope/issues). We have recently implemented a fully automated microscope on a simple motorized stand from Zaber. This provides a fully automated microscopy solution for a very low cost.

      We have edited the end of the discussion to read

      Microscope is a free and open source project currently being used in several labs with an open development approach. Our aim is that the microscope development community will find it a useful tool and engage in this development to increase its general usefulness. With that aim in mind, we perform our development conversations and user support in the open as github issues and the project is an image.sc community partner. In particular, expanding the number of devices supported by Microscope would be extremely beneficial. However, adding support for a device requires physical access to the device and the current list of supported devices echoes the devices we and our collaborators have access to. This is a chicken and egg problem. Python-Microscope needs broad device support to be widely adopted by the community but it needs contributions from the community to support those devices. We believe that Microscope currently provides enough devices and infrastructure to support adoption by more developers. There are contribution guidelines within the "Get Involved" section of the documentation, available online at https://www.python-microscope.org/doc/get-involved.

      The authors stress using their library for complex scopes but do not provide an example of complex implementation (they only provide paper references). Only a code for a simple time-series is provided. It would be very beneficial to provide the code for implementing a complex microscope and its GUI with the author's library as separate figures or in the paper's supplement. This would also support point 1 in the review.

      The GUI elements provided by Python-Microscope are deliberately minimal implementations to allow basic connectivity and functionality of specific hardware to be tested. Python-Microscope is specifically designed to provide a hardware interface layer separate from the user interface. We provide very simple examples to demonstrate how easily devices can be controlled. For more complete examples we have developed two associated packages providing GUIs, both of which are referenced in the text: BeamDelta is an optical alignment tool, while Microscope-Cockpit provides a full user interface to complex microscope systems. We have added a supplemental figure demonstrating the full GUI provided by Cockpit.

      **Minor comments:**

      It would help the paper if several phrases were changed: Title: "Python-Microscope: High-performance control of arbitrarily complex and scalable bespoke microscopes." To: e.g., "Python-Microscope: A new open-source Python library for the control of microscopes"

      Why? The authors use the word "high-performance" to address their Python library's trigger feature within the text. Unfortunately, that is not how most people would use the term. Therefore, it should be avoided not only in the title but throughout the text. Furthermore, the word "complex" combined with microscopes should be avoided. A complex microscope is, for most microscope builders, a microscope that needs precise timing and synchronization, includes several active feedback loops, incorporates several devices, is very stable, etc. The context in which the term "complex microscopes" is used here is when the authors talk about the library's features to connect devices to servers either locally or remotely. I agree that the library can connect devices over arbitrarily complex networks, but using the term "arbitrarily complex microscopes" would be misleading considering the library's current speed limitations, the limited number of currently integrated devices, etc.

      We have changed the title to:

      Python-Microscope: A new open-source library for the control of microscopes

      1. Various section titles: "Library features" would be more suitable than "Use Cases" since the individual new features at the new library are described in this section. Also, the description of the individual features should be mentioned more accurately. The following list might be a better, more accurate fit: (1) "Device modularity" instead of "Device independence."

      Also, the current title "Write once, run with any device" is inaccurate since the wrapper for multiple devices has not been implemented. (2) "Experiment- and scope-specific layout" instead of "Experiments as programs." (3) "Complex network integration" instead of "Easy implementation of complex systems and scalability" (see reasoning under point a). (4) "Hardware and software trigger integration" instead of "High performance, " (5) "Developer-friendly programming features" instead of "Simple development tool."

      We have renamed the specified section and subsection titles and expanded the description in the list of use cases to be more accurate.

      1. The authors should avoid using the term "Microscope" when talking about "Python-Microscope." It facilitates the manuscript's readability since it is occasionally not evident in the paper if they refer to the library or a microscope.

      We have changed “Microscope” to “Python-Microscope” in multiple places of the manuscript where it was unclear whether we were referring to the software or to a physical microscope.

      2. The authors should avoid the phrase "pythonic software platform" in the abstract since Python-Microscope is a library / Python package and not a software platform. Furthermore, the term "pythonic" describes the desired way to write Python code. It means code that does not just get the syntax right but follows the Python community conventions and uses the language in the way it is intended to be used. Instead, it might be advisable to write, "Python-Microscope offers elegant Python-based tools to control microscopes...".

      We have changed the abstract as suggested.

      Figure 1 should be supported by comments, e.g., #Load packages, #Parameter Initialization, #Create Devices, # Set camera parameters, etc.

      Comments have been added to the sample code.

      The paragraph under the section "Experiments as programs" about the advantages of using Python (starting from "We have developed the software in Python, ...") should be moved into the Introduction section.

      We have moved this segment to the end of the introduction.

      Reviewer #2:

      1) The introduction does a good job describing the current situation (using multiple software from multiple vendors simultaneously, Micro-Manager, Labview), although it could be highlighted a bit more that several groups have created custom Python code for microscope control (such as https://github.com/ZhuangLab/storm-control, https://github.com/Ulm-IQO/qudi, https://github.com/fedebarabas/tormenta, https://github.com/AndrewGYork/tools), some with at least the hope that their code will be generally usable. It also could be noted that the Micro-Manager device abstraction layer has been accessible from Python for more than a decade (currently the Python 3 interface is at https://github.com/micro-manager/pymmcore).

      We have significantly expanded the references to previous Python code and made other changes to the relevant sections as detailed in the response to reviewer #1 and quoted below. We have made reference to the recently published Pycro-Manager package (the previous version referenced the arXiv preprint of this paper). It should be noted that although the Python bindings for mmcore have been available for more than a decade, they have been rarely used; the only published paper referencing them appears to be the whitepaper from a workshop on microscope control software published on arXiv in 2020 (https://arxiv.org/abs/2005.00082).

      “There is currently an increasing number of software options for microscope control in Python, many of which are in the form of custom scripts specific to a microscope (Alvelid and Testa, 2019, York et al., 2013) but some provide fully integrated microscope control environments, namely PYME https://www.python-microscopy.org/ for SMLM and ACQ4 (Campagnola et al., 2014) for electrophysiology. While this code is freely available and can be modified, its design around a specific setup, technique, or environment reduces its potential for code reuse in other projects.”

      2) Manuscripts describing software tools have to balance the goal to "announce" and advertise the software package with the goal to objectively explain the design principles and choices made. In my opinion, this manuscript finds a nice balance, and leaves the reader with a decent understanding of the capabilities, advantages, limitations and high level architecture of the Python-Microscope package. Possible exceptions are the use of the word "elegant" in the abstract, and extensive use of the word "bespoke" that I mainly know from real estate agent language and that likely is confusing to many readers for whom English is a second language.

      We have reworded the abstract to say

      “Python-Microscope offers simple to use Python-based tools to control microscopes…”

      We use the term “bespoke” to refer to the construction of novel optical microscopes, as opposed to controlling existing integrated systems from commercial vendors. We have reworded the paper to refer to custom built microscopes and optical systems to clarify this point.

      As far as I am aware, "Microscope" is the most developed microscope abstraction layer written in pure Python. Remarkably, its design (device classes that inherit from a device-base class and have their own function calls, supplemented with "Settings" that can be declared by each device), is extremely similar to that of the Micro-Manager device abstraction layer (where "Settings" are called "Properties"), with the main difference being that one is written in Python and the other in C++ with C bindings. Writing these device classes in Python hopefully brings the advantage that more people can write them, however, the Micro-Manager C interface has the advantage that it can be used from any programming language on any platform, hence is more future proof than pure Python code. The downside of having multiple microscope device abstraction layers is that resources will be diluted and confuse partners in industry (which toolkit should they support with their limited resources?). The number of devices supported is currently much, much greater in the Micro-Manager platform than in Microscope, and a translation layer to make Micro-Manager device adapters in Microscope does not seem out of the question, and could possibly benefit many.

      We are aware of the similarity between our approach and that in micromanager. There is therefore significant overlap and possible duplication of effort. However, when we started this project we reviewed the Python bindings of micromanager core and felt that using this approach would add significantly, not only to our development effort, but also to end user effort as they would also have to install Micro Manager and its Python bindings. In addition, we believe that there is significant value in having a pure Python implementation. As the reviewer suggests "Python is at the moment probably the most widely used computer programming language by scientists". Having Python-Microscope in a language that the end user can code invites them to look into the “box” and eases the process for these possibly casual Python users to contribute fixes and support for new devices.

      Reviewer #3:

      • I miss more information regarding the latency of the device-server and software triggering, how fast can it be? How much delay would you have between computers/devices? For example, could we have the devices synchronized at the microsecond range? I think this is super important so that the reader knows if it's worth using a software triggering approach with Python-Microscope or they should buy a DAQ instead.

      We generally expect high performance hardware to require hardware triggering; software triggers are very unlikely to be performant or reliable enough to achieve ms, let alone µs, timing accuracy and reproducibility. Software triggering is implemented as a basic approach to allow simple low speed hardware control, such as basic image snapping. Our systems all utilise external timing devices to provide digital triggering and, in some cases, analogue voltage control. This is becoming increasingly easy with high performance microprocessors such as the Arduino or higher spec solutions such as National Instruments DAQ boards. We are currently investigating the recently released Raspberry Pi Pico boards, which provide very high performance digital triggering at very low cost (~£4). We are passionately promoting open source, low cost solutions, so requiring a NI DAQ board and LabView licenses goes against the spirit of this project.

      1b) It's good though that they don't want to limit themselves to software triggering but also mention hardware triggering, but it's important to better explain where are the limitations.

      This is a significant issue but we feel it is beyond the scope of this paper. We utilise microscope as a low level interface to hardware for our systems. The hardware control software has no internal knowledge of device connectivity eg which filter wheel might be in front of what camera, so any integrated control, such as synchronising light sources and cameras is beyond the scope of this package. We use the cockpit package as a GUI and to provide this higher level control integration. We then utilise hardware timing devices interfaced to cockpit to run experiments. We feel that this is a relatively cheap and approachable solution while allowing high performance from even complex systems.

      1c) Needs info adding to the text, but in general python-microscope doesn’t concern itself with this, just allows setting of trigger types and you are then responsible for triggering.

      As suggested by the reviewer, Python-Microscope does not generally concern itself with triggering. It allows setting of trigger types in a consistent manner, and on relevant devices can initiate a software trigger event. The end of the section “Fast and furious” now reads:

      “The microscope interface was designed with the concept of triggers that activate the individual devices and software triggers are handled as simply another trigger type. This approach provides an interface that supports software triggers but is easily upgraded to hardware triggers. The source of such hardware triggers can be other devices --- typically a camera --- or a dedicated triggering device. The recommended procedure is to prepare an experiment template that is then loaded on a dedicated timing device which triggers all other devices, as described in Carlton 2010.

      The existence of fast and cheap microprocessors and single board computers mean providing a dedicated hardware timing to sequence and synchronise a number of devices is relatively easy and extremely cost effective. We would recommend systems are designed around using an external device to provide hardware triggers to devices. This provides reliable timing and much more flexible sequencing than directly connecting outputs from one device to trigger inputs for another.”
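      The idea of treating a software trigger as simply another trigger type can be illustrated with a minimal sketch. The names `TriggerType` and `SimulatedCamera` are hypothetical and chosen for illustration only; they are not taken from the Python-Microscope API, and the hardware pulse is simulated by a method call.

```python
import enum


class TriggerType(enum.Enum):
    """Hypothetical trigger types; not Python-Microscope's actual enum."""
    SOFTWARE = "software"
    RISING_EDGE = "rising_edge"


class SimulatedCamera:
    """Toy camera that counts exposures, standing in for real hardware."""

    def __init__(self) -> None:
        self.trigger_type = TriggerType.SOFTWARE
        self.frames_acquired = 0

    def set_trigger(self, trigger_type: TriggerType) -> None:
        # The calling code only changes this setting to move from
        # software to hardware triggering.
        self.trigger_type = trigger_type

    def trigger(self) -> None:
        """Fire a software trigger; only valid in SOFTWARE mode."""
        if self.trigger_type is not TriggerType.SOFTWARE:
            raise RuntimeError("device is waiting for a hardware trigger")
        self._expose()

    def on_hardware_pulse(self) -> None:
        """Simulate a pulse from an external timing device."""
        if self.trigger_type is TriggerType.RISING_EDGE:
            self._expose()

    def _expose(self) -> None:
        self.frames_acquired += 1


camera = SimulatedCamera()
camera.trigger()  # one software-triggered frame

camera.set_trigger(TriggerType.RISING_EDGE)
for _ in range(3):  # three pulses from the (simulated) timing device
    camera.on_hardware_pulse()

print(camera.frames_acquired)
```

      In this sketch the acquisition code is unchanged when the trigger source changes; only the trigger setting differs, which is the upgrade path the quoted paragraph describes.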

      1d) I also miss information about the triggering, does the software offer a platform that can synchronize devices, or is that left to the developer to do? They say they can generalize to arbitrarily complex devices so therefore I think it needs to be specified how. Same with the server feature, how fast is that link?

      The software triggering depends very much upon the individual devices and delays such as context switching within the OS. We offer no solution to synchronise devices. Our claim to generalise to arbitrarily complex systems is based on the fact that you can trivially run devices on different computers to allow horizontal scaling. If you wish to have 25 cameras, simply run them on different computers, then none will be speed limited by computational resources. Synchronisation can be achieved by an external hardware timing device as described above.

      The server link is passed over standard ethernet, likely now 1 Gb/s, however data packets must be serialised before transmission and deserialised on receipt by Python, in addition to standard network overhead and latency. We have only seen network limitations on image transfer from cameras to remote server computers. This has not been a significant issue as the camera drivers typically have memory buffers, which can be enlarged to cope with backlogs, and the Python-Microscope image transmission processes act on a FIFO memory queue. Long experiments utilising fast, high pixel count cameras could possibly saturate these buffers, but such a specialised application could use specialised solutions such as multi-path networking or a computer with a very large amount of RAM for temporary buffering.
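      The buffering argument above can be made concrete with a rough back-of-the-envelope calculation. The sketch below is illustrative only: the camera parameters and the 8 GiB buffer are assumed example values, the function names are hypothetical, and protocol and serialisation overhead are ignored, so a real link would sustain less than the nominal rate.

```python
def camera_data_rate(width: int, height: int,
                     bytes_per_pixel: int, fps: float) -> float:
    """Raw image data produced by a camera, in bytes per second."""
    return width * height * bytes_per_pixel * fps


def seconds_until_buffer_full(data_rate: float, link_rate: float,
                              buffer_bytes: float) -> float:
    """Time before a buffer overflows when data arrives faster than it drains.

    Returns infinity when the link keeps up with the camera.
    """
    backlog_rate = data_rate - link_rate
    if backlog_rate <= 0:
        return float("inf")
    return buffer_bytes / backlog_rate


# Nominal 1 Gb/s ethernet expressed in bytes per second (assumed, no overhead).
LINK_BYTES_PER_S = 1e9 / 8

# Assumed example: 2048 x 2048 pixels, 16-bit, 100 frames per second.
fast_rate = camera_data_rate(2048, 2048, 2, 100)
fast_fill = seconds_until_buffer_full(fast_rate, LINK_BYTES_PER_S, 8 * 1024**3)

# Assumed example: 512 x 512 pixels, 16-bit, 30 frames per second.
slow_rate = camera_data_rate(512, 512, 2, 30)
slow_fill = seconds_until_buffer_full(slow_rate, LINK_BYTES_PER_S, 8 * 1024**3)

print(f"fast camera: {fast_rate / 1e6:.1f} MB/s, buffer lasts {fast_fill:.1f} s")
print(f"slow camera: {slow_rate / 1e6:.1f} MB/s, buffer lasts {slow_fill} s")
```

      Under these assumptions the fast camera outpaces the link several times over and fills the buffer in seconds, whereas the slow camera never accumulates a backlog, which is consistent with network limits appearing only for demanding image transfer.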

      2a) Some critical comments are that, first of all there are not so many drivers yet available (for example Hamamatsu camera).

      The reviewer is correct, device support is critical. There are two components to this, a) the resources to implement new devices, and b) the physical hardware to enable testing and debugging of these devices. We have focused on the hardware that we own and use but hardware support is expanding. As described in our reply to reviewer #1, we hope that a community of experienced hardware and software developers will evolve and help support new devices. We have instructions on how to support new hardware devices and are happy to help interested parties. We also plan to apply for continuing funding to enable us to further develop Python-Microscope, especially to expand its range of supported hardware.

      The well defined interface with the abstract base class in Python enforces what is required for a minimal implementation of a specific device type. Most devices are relatively easily supported by reference to existing devices of the same type. For instance, a stage is likely to be communicated to by serial over USB, taking simple text commands and returning easy to interpret responses. Adding a new device simply involves defining what commands to send and how to deal with the replies from the hardware. With a suitable manual this can typically be done with a few hours of programming and testing.
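
      To make this concrete, the sketch below shows the general pattern: an abstract base class fixes the minimal interface, and a concrete driver only defines which text commands to send and how to parse the replies. All names here (`Stage`, `TextCommandStage`, `FakeSerialPort`) and the `MOVE` / `POS?` command set are hypothetical and do not reproduce the actual Python-Microscope API or any real controller protocol. The fake port mimics only the `write` and `readline` methods of a serial connection so that the example runs without hardware.

```python
from abc import ABC, abstractmethod


class Stage(ABC):
    """Hypothetical minimal interface that every stage driver must implement."""

    @abstractmethod
    def move_to(self, position_um):
        """Move to an absolute position in micrometres."""

    @abstractmethod
    def get_position(self):
        """Return the current position in micrometres."""


class FakeSerialPort:
    """In-memory stand-in for a serial connection to an imaginary controller."""

    def __init__(self):
        self._position = 0.0
        self._reply = b""

    def write(self, data):
        command = data.decode("ascii").strip().split()
        if command[0] == "MOVE":
            self._position = float(command[1])
            self._reply = b"OK\r\n"
        elif command[0] == "POS?":
            self._reply = f"{self._position:.3f}\r\n".encode("ascii")
        else:
            self._reply = b"ERR\r\n"

    def readline(self):
        reply, self._reply = self._reply, b""
        return reply


class TextCommandStage(Stage):
    """Driver that only defines which commands to send and how to parse replies."""

    def __init__(self, port):
        self._port = port

    def _query(self, command):
        self._port.write(f"{command}\r\n".encode("ascii"))
        return self._port.readline().decode("ascii").strip()

    def move_to(self, position_um):
        if self._query(f"MOVE {position_um:.3f}") != "OK":
            raise RuntimeError("stage rejected the move command")

    def get_position(self):
        return float(self._query("POS?"))
```

      Because `Stage` declares its methods as abstract, Python refuses to instantiate any driver that omits one of them, which is how the base class enforces a minimal implementation.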

      2b) I guess this paper is also to show proof of concept and then upon interest they will include more devices, but in that case it should be more documented how one can contribute to the project and generate new drivers. For example, if we want to try it tomorrow in our setups, and we have a specific device such as a Hamamatsu camera, what should we do? Should we contact the authors, write an issue in the github page or write the driver ourselves?

      We have added the following paragraph on contributing to the project at the end of the discussion section of the paper:

      Microscope is a free and open source project currently being used in several labs with an open development approach. Our aim is that the microscope development community will find it a useful tool and engage in this development to increase its general usefulness. With that aim in mind, we perform our development conversations and user support in the open as github issues and the project is an image.sc community partner. In particular, expanding the number of devices supported by Microscope would be extremely beneficial. However, adding support for a device requires physical access to the device and the current list of supported devices echoes the devices we and our collaborators have access to. This is a chicken and egg problem. Python-Microscope needs broad device support to be widely adopted by the community but it needs contributions from the community to support those devices. We believe that Microscope currently provides enough devices and infrastructure to support adoption by more developers. There are contribution guidelines within the "Get Involved" section of the documentation, available online at https://www.python-microscope.org/doc/get-involved.

      • Second, the graphical interface is maybe good enough for developers and builders but in order to have a solid microscope that biologists are going to use it needs a bit more work in that direction.

      The GUI in Microscope is extremely basic and designed for quick testing. For a microscope system aimed at biological users we would recommend either using Microscope-Cockpit (our paper is now referenced and a supplemental figure shows an example of its interface) or implementing an alternative, more specialised GUI. We have released Python-Microscope as a separate package to separate low-level hardware control from a GUI front end, to enable relatively easy automated control of microscope systems directly from Python, and to allow others to create GUI-based interfaces without having to deal with interfacing to specific hardware.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Pinto et al. present new Python-based software to control microscopes. Overall the work is very interesting and will help microscopists to accelerate their development by providing a new tool to integrate different hardware.

      A few aspects commented below need to be clarified to help potential future users to integrate the software for the correct microscopes/hardware.

      In general the software is mostly targeted to developers that want to build microscopes, as they mention in the discussion. Some positive features are (1) the ability to have experiments as scripts, (2) the software triggering, (3) the device-server structure, and (4) the ability to have virtual devices to try out the code and the testing I see in the github page. I think it's robust especially and mostly for the device-layer of the software. It's also positive that one can install it in python and import it in your programs, so it can be incorporated into other software fairly easy.

      I miss more information regarding the latency of the device-server and software triggering: how fast can it be? How much delay would you have between computers/devices? For example, could we have the devices synchronized at the microsecond range? I think this is super important so that the reader knows if it's worth using a software triggering approach with Python-Microscope or they should buy a DAQ instead. It's good though that they don't want to limit themselves to software triggering but also mention hardware triggering, but it's important to better explain where the limitations are.

      I also miss information about the triggering: does the software offer a platform that can synchronize devices, or is that left more to the developer to do? They say they can generalize to arbitrarily complex devices, so I think it needs to be specified how. Same with the server feature: how fast is that link?

      Some critical comments are that, first of all there are not so many drivers yet available (for example Hamamatsu camera). I guess this paper is also to show proof of concept and then upon interest they will include more devices, but in that case it should be more documented how one can contribute to the project and generate new drivers. For example, if we want to try it tomorrow in our setups, and we have a specific device such as a Hamamatsu camera, what should we do? Should we contact the authors, write an issue in the github page or write the driver ourselves?

      Second, the graphical interface is maybe good enough for developers and builders but in order to have a solid microscope that biologists are going to use it needs a bit more work in that direction.

      Significance

      Microscope control software, especially if it is open source, can help the rapid integration of new hardware and accelerate overall microscopy development.

      I see this paper as an important starting point platform for future more user friendly Python-microscope controlling software.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript describes Python-Microscope, a library/framework written in Python to control custom-built microscopes. Modern light microscopes consist of many computer controllable components and data sensors, and software has become an integral component of such systems. Microscopy is such a fast moving and diverse technology that a significant (>25%?) fraction of microscope systems can not be cookie-cutter, standardized systems, but are custom-built, assembled using commercial microscope stands and/or hardware from vendors such as Thorlabs. For many, creating the software to control such custom-built systems is more laborious and difficult than building the actual optical setup, and software toolkits to make this easier (such as the one presented in this manuscript) are of great interest to everyone working in this area. Python is at the moment probably the most widely used computer programming language by scientists, and a well-thought-out environment for microscope control from the Python language is a welcome addition.

      The introduction does a good job describing the current situation (using multiple software from multiple vendors simultaneously, Micro-Manager, Labview), although it could be highlighted a bit more that several groups have created custom Python code for microscope control (such as https://github.com/ZhuangLab/storm-control, https://github.com/Ulm-IQO/qudi, https://github.com/fedebarabas/tormenta, https://github.com/AndrewGYork/tools), some with at least the hope that their code will be generally usable. It also could be noted that the Micro-Manager device abstraction layer has been accessible from Python for more than a decade (currently the Python 3 interface is at https://github.com/micro-manager/pymmcore).

      Manuscripts describing software tools have to balance the goal to "announce" and advertise the software package with the goal to objectively explain the design principles and choices made. In my opinion, this manuscript finds a nice balance, and leaves the reader with a decent understanding of the capabilities, advantages, limitations and high level architecture of the Python-Microscope package. Possible exceptions are the use of the word "elegant" in the abstract, and extensive use of the word "bespoke" that I mainly know from real estate agent language and that likely is confusing to many readers for whom English is a second language.

      As far as I am aware, "Microscope" is the most developed microscope abstraction layer written in pure Python. Remarkably, its design (device classes that inherit from a device-base class and have their own function calls, supplemented with "Settings" that can be declared by each device), is extremely similar to that of the Micro-Manager device abstraction layer (where "Settings" are called "Properties"), with the main difference being that one is written in Python and the other in C++ with C bindings. Writing these device classes in Python hopefully brings the advantage that more people can write them, however, the Micro-Manager C interface has the advantage that it can be used from any programming language on any platform, hence is more future proof than pure Python code. The downside of having multiple microscope device abstraction layers is that resources will be diluted and confuse partners in industry (which toolkit should they support with their limited resources?). The number of devices supported is currently much, much greater in the Micro-Manager platform than in Microscope, and a translation layer to make Micro-Manager device adapters in Microscope does not seem out of the question, and could possibly benefit many.

      Expected audience:

      This manuscript will be of interest to those scientists who build/assemble their own microscope systems and write software code to control their operation.

      Field of expertise:

      I think a lot about microscope control software and how it can help scientists do their experiments.

      Significance

      see above.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Pinto et al. report Python-Microscope, a new open-source Python library for microscopy control. The new library lets microscope builders implement individual microscope devices as Python Classes with devices specific parameters and methods. Furthermore, the new Python library supports remote procedural calls and turns individual devices into a resource accessible over a network. Moreover, it has been designed to support hardware as well as software triggers. Finally, it provides several developer-friendly features; it is equipped with simple GUI programs for different device types, and it can simulate devices without the need for physical access to the hardware.

      Major comments:

      1. The authors highlight in their conclusion that the new Python library has the potential to accelerate and expand microscopy development. I agree with this statement since classes and methods do not need to be written in Python from scratch anymore. However, I would recommend that the authors include in their conclusion the value of the library for reproducibility if the final python acquisition code is shared along with publications. Nowadays, scientists frequently write in their publications that LabView or a specific commercial scope's acquisition software was used without any acquisition code. Python-Microscope would have the potential to change this trend, and the authors need to stress this aspect and its value for reproducibility in science accordingly.
      2. The authors need to provide a more comprehensive overview of the currently used data acquisition strategies in their introduction. Currently, they highlight the acquisition software provided by vendors for data acquisition (mainly used by life scientists and not necessarily scope builders/developers), Micro-Manager (mainly used by life scientists; currently also restricted to wide-field systems), and LabView (for advanced microscope systems; used by advanced developers). However, most advanced microscope builders use MatLab (Chmyrov et al. Nature Methods (2013) - https://doi.org/10.1038/nmeth.2556, Ta et al. Nature Communications (2015) - https://doi.org/10.1038/ncomms8977 , etc.), Python (York et al. Nature Methods (2013) - https://doi.org/10.1038/nmeth.2687, etc.), and LabView to write their acquisition software. Since the manuscript focuses on advanced microscopes, the authors need to position their library with respect to the current use of MatLab and Python as well.
      3. The authors need to give (1) software provided by vendors, (2) LabView, and (3) Micro-Manager more credit. (1) Several microscope vendors (e.g., Abberior Instruments - https://imspectordocs.readthedocs.io/en/latest/specpy.html ) allow their scopes to be externally controlled to enable the execution of customer-driven acquisition strategies which the vendor's acquisition software itself might not have implemented. The authors might want to include that scope vendors aim for more customer-modifiable acquisition software. (2) The authors criticize that LabView code can be hard to understand, reproduce and maintain. However, similar to writing good code in general, there are best practice strategies for writing good LabView code to ensure scalability, readability, and maintainability available as well (https://learn.ni.com/learning-paths/labview-core-3-2016-english ). The primary problem might lie more with lousy coding practice than with LabView's ability to perform appropriately. (3) The authors should include the current effort by Pinkard et al. (Pinkard et al. Nature Methods (2021) - https://doi.org/10.1038/s41592-021-01087-6 ) in their discussion.
      4. The authors might want to explain how they plan to facilitate the library's adoption and the long-term maintenance within the microscopy community. Do they plan to create a new category on Image.sc, which would allow the community to interact with the developers? Furthermore, who will keep writing wrappers to the libraries provided by the vendors? Several useful software packages have been written in the past, but their existence was often not for long (after 2-3 years, most packages simply cannot be used anymore). The concept of software maintenance is frequently not addressed or considered. Therefore, could the authors expand on this aspect in an additional section of their paper?
      5. The authors stress using their library for complex scopes but do not provide an example of a complex implementation (they only provide paper references). Only code for a simple time series is provided. It would be very beneficial to provide the code for implementing a complex microscope and its GUI with the authors' library as separate figures or in the paper's supplement. This would also support point 1 in the review.

      Minor comments:

      1. It would help the paper if several phrases were changed:

         a. Title: "Python-Microscope: High-performance control of arbitrarily complex and scalable bespoke microscopes." To: e.g., "Python-Microscope: A new open-source Python library for the control of microscopes". Why? The authors use the word "high-performance" to address their Python library's trigger feature within the text. Unfortunately, that is not how most people would use the term. Therefore, it should be avoided not only in the title but throughout the text. Furthermore, the word "complex" combined with microscopes should be avoided. A complex microscope is, for most microscope builders, a microscope that needs precise timing and synchronization, includes several active feedback loops, incorporates several devices, is very stable, etc. The context in which the term "complex microscopes" is used here is when the authors talk about the library's features to connect devices to servers either locally or remotely. I agree that the library can connect devices over arbitrarily complex networks, but using the term "arbitrarily complex microscopes" would be misleading considering the library's current speed limitations, the limited number of currently integrated devices, etc.

         b. Various section titles: "Library features" would be more suitable than "Use Cases" since the individual new features of the new library are described in this section. Also, the individual features should be described more accurately. The following list might be a better, more accurate fit: (1) "Device modularity" instead of "Device independence." Also, the current title "Write once, run with any device" is inaccurate since the wrapper for multiple devices has not been implemented. (2) "Experiment- and scope-specific layout" instead of "Experiments as programs." (3) "Complex network integration" instead of "Easy implementation of complex systems and scalability" (see reasoning under point a.) (4) "Hardware and software trigger integration" instead of "High performance." (5) "Developer-friendly programming features" instead of "Simple development tool."

         c. The authors should avoid using the term "Microscope" when talking about "Python-Microscope." Doing so would improve the manuscript's readability, since it is occasionally not evident in the paper whether they refer to the library or a microscope.

         d. The authors should avoid the phrase "pythonic software platform" in the abstract since Python-Microscope is a library / Python package and not a software platform. Furthermore, the term "pythonic" describes the desired way to write Python code. It means code that does not just get the syntax right but follows the Python community conventions and uses the language in the way it is intended to be used. Instead, it might be advisable to write, "Python-Microscope offers elegant Python-based tools to control microscopes...".
      2. Figure 1 should be supported by comments, e.g., #Load packages, #Parameter Initialization, #Create Devices, # Set camera parameters, etc.
      3. The paragraph under the section "Experiments as programs" about the advantages of using Python (starting from "We have developed the software in Python, ...") should be moved into the Introduction section.

      Significance

      The field of microscopy emphasizes more and more openness and transparency of methods and tools being used to accelerate science, but also to guarantee reproducibility.

      The authors' library is another step in the right direction. It is open, transparent, tries to satisfy multiple tool developers' needs to make the development of microscopes faster, easier, and more approachable/user-friendly. Although it can not yet be used for arbitrarily complex microscopes, it has the potential to do so in the future. For now, the authors need to manage to incorporate and involve microscopy developers' needs and requirements in the best possible way to be able to design the library as holistic as possible.

      I am a physicist and microscope builder and have so far used MatLab, LabView, and Imspector as well as Python scripts to control microscopes, and I will definitely test the authors' library on my own.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their constructive and critical feedback on our original manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this study, the authors explored the tissue-specific regulation of DT size using both global and targeted deletion of Fgf9. They found cell hypertrophy and mineralization dynamics of the DT, as well as transcriptional signatures from skeletal muscle but not bone, were influenced by the global loss of Fgf9. Deletion of Fgf9 in skeletal muscle leads to postnatal enlargement of the DT. However, the innovation of this paper is not enough, the phenotypes of global deletion of Fgf9 were previously reported, most of the data in this paper are mainly descriptive analysis of the phenotypes, and internal cellular and molecular mechanisms were not well investigated.

      Here are the major issues:

      1. The data showed that fewer osteoclasts were present at both E16.5 and P0 in Figure 2R, V. Does FGF9 affect both osteogenesis and osteoclast formation?

      • Authors’ response to Reviewer: Thank you for your feedback. We revised this manuscript to reflect the concerns of Reviewer 1 related to the lack of cellular and molecular mechanisms as described below. Based on this question from the Reviewer, we have revised our discussion to clarify our findings as follows: “From our EdU proliferation assays, we observed a decline in cell proliferation in Fgf9null attachments, suggesting an accelerated chondrocyte maturation. Though we saw similar levels of Pthlh expression (a chondrocyte hypertrophy suppressor) in both WT and Fgf9null attachments, we also saw increased expression of Gli1 (a marker of chondrocyte hypertrophy) localized to the attachment in Fgf9null embryos compared to WT embryos. This decrease in proliferation was in parallel with increased hypertrophy of chondrocytes adjacent to the attachment cells within the Fgf9null DT, which may have led to a rapid expansion of matrix in the DT. Even though the DT was enlarged in Fgf9null mutants, we found fewer Sost+ cell clusters in their DTs compared to WT mice. Mature osteocytes express Sost (Winkler et al., 2003), and fewer Sost+ cells may indicate an impaired ability of Fgf9null osteoblasts to embed and mature into osteocytes. Overexpression of FGF9 in the perichondrium has been previously shown to suppress chondrocyte proliferation and limit bone growth in the limb (Karuppaiah et al., 2016); in our study, we found that loss of Fgf9 globally leads to an accelerated enlargement of chondrocytes in the tuberosity. This accelerated enlargement may limit the ability of these cells to deposit matrix and mineral and therefore limit osteocyte differentiation. We also found fewer osteoclasts in the Fgf9null DT which mirrors previous reports using the same mutation to study the length and vascularity of developing limb (Hung et al., 2007).
Because the DT is enlarged and resides on the surface of a shortened bone, this phenotype may elucidate a divergent role of FGF9 in patterning of an arrested (e.g., attachment) growth plate compared to a regular (e.g., long bone) growth plate. This includes unexplored roles of FGF9 in vascularity of the tendon attachment and formation of bone ridges that overlap with or deviate from its role in growth plate development that are beyond the scope of the current study.”
      2. RNA-sequencing analysis showed the decreased expression of mitochondria/energy and lipid associated genes in Fgf9 null muscle compared to WT muscle. How does this relate to the enlargement of the DT? What are the detailed molecular mechanisms?
      • Authors’ response to Reviewer:
      • Based on this question from the Reviewer, we have revised our discussion to reflect the potential molecular mechanisms related to muscle mitochondria, fiber type, and metabolism as follows:

      “Fgf9 is expressed in muscle during embryonic stages, which we and others have observed using ISH (Colvin et al., 1999; Garofalo et al., 1999; Hung et al., 2007; Yang and Kozin, 2009). Previous work has established a connection between Fgf9 and muscle, as treatment of muscle and muscle progenitor cells with FGF9 slows maturation, enhances proliferation, and decreases expression of various myogenic genes (Huang et al., 2019). This study found supporting evidence that Fgf9 expression in muscle may be a limiting factor in tuberosity growth. However, it remains unknown how other FGFs and their receptors, FGFRs, regulate superstructure and attachment formation. In this study, we identified potential mediators of skeletal muscle metabolism in Fgf9null muscle, including downregulated mitochondrial-related genes associated with oxidative respiration and proton transport (i.e., Slc36a2 and Ucp1, amongst others). In cultured myoblasts, FGF9 can inhibit myogenic differentiation potentially via increased production of Myostatin (Huang et al., 2019), a well-established mediator of fast glycolytic muscle fibers (Girgenrath et al., 2005; Hennebry et al., 2009). While the role of FGF9 in myoblast fusion has been investigated in vitro, its effect on muscle fiber type and fiber metabolism (i.e., oxidative vs. glycolytic) has not yet been explored. Our findings from bulk RNA-seq of Fgf9null muscle point to potential mechanisms in muscle metabolism that may contribute to the enlarged phenotype that is mimetic of that found in Myostatin deficient mice and other animals (Elkasrawy and Hamrick, 2010; Hamrick et al., 2002). Additionally, further investigations are needed to investigate the potential role of Fgf9 in mitochondrial function and lipid metabolism. Recent work by Huang et al. 
also identified FGF9 as a potent regulator of calcium signaling and homeostasis in myoblast culture in vitro, and calcium release from the sarcoplasmic reticulum in muscle plays a critical role during embryonic skeletal myogenesis via ryanodine receptor 1 (RYR1). Although Ryr1 was not significantly different in between Fgf9null and WT muscle in the present study, we did find that calmodulin-associated genes (e.g., Calm4, Calml3, Camsap3, Calm5) were all significantly upregulated in Fgf9null muscle compared to WT muscle. Calmodulin interacts with RYR1 and its activation is required for intracellular binding of calcium (Newman et al., 2014, 1). Calmodulin is a crucial component of the calcium signal transduction pathway and also plays an important role in lipid and glucose metabolism (Nishizawa et al., 1988). Taken together, our findings along with recent work by Huang et al. support more mechanistic studies to investigate the metabolic effects of loss and gain of function of Fgf9 on skeletal muscle as well as the muscle secretome.”

      Reviewer #1 (Significance (Required)):

      R1 The authors compared the phenotypes between global and muscle-specific deletion of Fgf9 in mice, and found that Fgf9 secreted by muscle may induce the enlargement of the DT. However, the detailed molecular mechanisms were not well investigated.

      **Referees cross-commenting**

      R2 I do not disagree with Rev 1, but I do not think such a task is so trivial, which is the reason why I don't suggest it; it could take years to determine molecular mechanisms of anything. The authors could expand the discussion and offer some possibilities. If they had some RNAseq data they maybe could suggest some of the key signaling pathways involved.

      **Referees cross-commenting**

      R1 We still suggest that the internal cellular and molecular mechanisms should be well investigated in this paper.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      • This paper deals with an important topic which is exact molecular mechanisms regulating the growth of bony tuberosities; because this region is essential for force transmission and movement.
      • Based on the previous information they had that in the global KO of the gene FGF9 the deltoid muscle is enlarged; and this muscle is in a very important tuberosity; they decided to look at FGF9 as a potential genetic regulator.
      • The manuscript is clear, objective, concise. Very clear. Authors used both the global and targeted deletions, very high reproducibility.

      Reviewer #2 (Significance (Required)):

      • This manuscript advances several areas since we know little about the mechanisms controlling local mechanisms of tuberosities. It also advances our knowledge of FGF9. There were several studies before mostly in vitro showing that FGF9 when added to muscle cells could arrest myogenesis, but the types of experiments in vivo had not been performed yet. The authors used an array of methods; the studies are unbiased and very rigorous and also they always show all experimental points, which is excellent. The conclusions are supported by the data.

      • The main suggestion for authors: They essentially do not discuss the nature of the potential muscle-to-bone signaling occurring when they target the deletion of FGF9 in skeletal muscles, the muscles enlarge, and there is a series of adaptations in the tuberosity. Do the authors believe this to be due entirely to the genetic changes or potentially to act through secreted myokines? In the paper of Huang et al., 2019, the authors document an effect of FGF9 on intracellular calcium homeostasis/signaling; could this be part of the mechanism? Perhaps the authors could propose a model?

      Authors’ response to Reviewer:

      • Future studies could investigate the secretome of muscle in Fgf9null or muscle-specific knockouts, as well as assess calcium signaling homeostasis in Fgf9 mutant muscles. We did find calcium- and ion-associated genes in the RNAseq and revised the discussion to include this information.
      • Based on this question from the Reviewer, we have revised our discussion to reflect the potential molecular mechanisms related to muscle mitochondria, fiber type, and metabolism as follows: “Fgf9 is expressed in muscle during embryonic stages, which we and others have observed using ISH (Colvin et al., 1999; Garofalo et al., 1999; Hung et al., 2007; Yang and Kozin, 2009). Previous work has established a connection between Fgf9 and muscle, as treatment of muscle and muscle progenitor cells with FGF9 slows maturation, enhances proliferation, and decreases expression of various myogenic genes (Huang et al., 2019). This study found supporting evidence that Fgf9 expression in muscle may be a limiting factor in tuberosity growth. However, it remains unknown how other FGFs and their receptors, FGFRs, regulate superstructure and attachment formation. In this study, we identified potential mediators of skeletal muscle metabolism in Fgf9null muscle, including downregulated mitochondrial-related genes associated with oxidative respiration and proton transport (i.e., Slc36a2 and Ucp1, amongst others). In cultured myoblasts, FGF9 can inhibit myogenic differentiation potentially via increased production of Myostatin (Huang et al., 2019), a well-established mediator of fast glycolytic muscle fibers (Girgenrath et al., 2005; Hennebry et al., 2009). While the role of FGF9 in myoblast fusion has been investigated in vitro, its effect on muscle fiber type and fiber metabolism (i.e., oxidative vs. glycolytic) has not yet been explored. Our findings from bulk RNA-seq of Fgf9null muscle point to potential mechanisms in muscle metabolism that may contribute to the enlarged phenotype that is mimetic of that found in Myostatin deficient mice and other animals (Elkasrawy and Hamrick, 2010; Hamrick et al., 2002). Additionally, further investigations are needed to investigate the potential role of Fgf9 in mitochondrial function and lipid metabolism. Recent work by Huang et al. 
also identified FGF9 as a potent regulator of calcium signaling and homeostasis in myoblast culture in vitro, and calcium release from the sarcoplasmic reticulum in muscle plays a critical role during embryonic skeletal myogenesis via ryanodine receptor 1 (RYR1). Although Ryr1 was not significantly different between Fgf9null and WT muscle in the present study, we did find that calmodulin-associated genes (e.g., Calm4, Calml3, Camsap3, Calm5) were all significantly upregulated in Fgf9null muscle compared to WT muscle. Calmodulin interacts with RYR1 and its activation is required for intracellular binding of calcium (Newman et al., 2014). Calmodulin is a crucial component of the calcium signal transduction pathway and also plays an important role in lipid and glucose metabolism (Nishizawa et al., 1988). Taken together, our findings along with recent work by Huang et al. support more mechanistic studies to investigate the metabolic effects of loss and gain of function of Fgf9 on skeletal muscle as well as the muscle secretome.

      In conclusion, this work established a new role of skeletal muscle derived Fgf9 during skeletal development and tuberosity growth. Additionally, our unbiased transcriptomic approaches and rigorous analyses identified new potential mechanisms associated with muscle development, mitochondrial bioenergetics, and muscle metabolism that warrant further investigation into the role of FGF9 in muscle-bone crosstalk.”

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This paper deals with an important topic: the exact molecular mechanisms regulating the growth of bony tuberosities, a region that is essential for force transmission and movement. Based on their previous finding that the deltoid muscle is enlarged in the global KO of Fgf9, and given that this muscle attaches at a very important tuberosity, the authors decided to look at FGF9 as a potential genetic regulator.

      The manuscript is clear, objective, concise. Very clear. Authors used both the global and targeted deletions, very high reproducibility.

      Significance

      This manuscript advances several areas, since we know little about the local mechanisms controlling the growth of tuberosities. It also advances our knowledge of FGF9. Several earlier studies, mostly in vitro, showed that FGF9 added to muscle cells could arrest myogenesis, but the corresponding in vivo experiments had not yet been performed. The authors used an array of methods; the studies are unbiased and very rigorous, and they always show all experimental points, which is excellent. The conclusions are supported by the data.

      The main suggestion for the authors: they essentially do not discuss the nature of the potential muscle-to-bone signaling that occurs when they target the deletion of FGF9 in skeletal muscle, the muscles enlarge, and there is a series of adaptations in the tuberosity. Do the authors believe this is entirely due to the genetic changes, or potentially mediated by secreted myokines? In the paper of Huang et al., 2019, the authors document an effect of FGF9 on intracellular calcium homeostasis/signaling; could this be part of the mechanism? Perhaps the authors could propose a model?

      Referees cross-commenting

      I do not disagree with Rev 1, but I do not think such a task is trivial, which is why I did not suggest it; it could take years to determine the molecular mechanisms of anything. The authors could expand the discussion and offer some possibilities. If they have some RNAseq data, they could perhaps suggest some of the key signaling pathways involved.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this study, the authors explored the tissue-specific regulation of DT size using both global and targeted deletion of Fgf9. They found that cell hypertrophy and mineralization dynamics of the DT, as well as transcriptional signatures from skeletal muscle but not bone, were influenced by the global loss of Fgf9. Deletion of Fgf9 in skeletal muscle leads to postnatal enlargement of the DT. However, the paper is not sufficiently innovative: the phenotypes of global deletion of Fgf9 were previously reported, most of the data are descriptive analyses of the phenotypes, and the underlying cellular and molecular mechanisms were not well investigated.

      Here are the major issues:

      1. The data showed that fewer osteoclasts were present at both E16.5 and P0 in Figure 2R, V. Does FGF9 affect both osteogenesis and osteoclast formation?

      2. RNA-sequencing analysis showed decreased expression of mitochondria/energy- and lipid-associated genes in Fgf9 null muscle compared to WT muscle. How does this relate to the enlargement of the DT? What are the detailed molecular mechanisms?

      Significance

      The authors compared the phenotypes of global and muscle-specific deletion of Fgf9 in mice, and found that Fgf9 secreted by muscle may induce the enlargement of the DT. However, the detailed molecular mechanisms were not well investigated.

      Referees cross-commenting

      We still suggest that the underlying cellular and molecular mechanisms should be well investigated in this paper.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Dear Dr. Monaco,

      Thank you for reviewing our manuscript entitled ‘Discovery of re-purposed drugs that slow SARS-CoV-2 replication in human cells’. We are pleased to see that the reviewers make suggestions that will strengthen the paper. With cases of COVID-19 rising at dramatic levels in some parts of the world, we are anxious to see our results published in a peer-reviewed journal.

      Please find below a detailed response to the comments, shown in bold. We can perform the additional experiments and make changes to the manuscript within 3 weeks of a journal agreeing to consider our paper.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Pickard et al. present in the manuscript entitled "Discovery of re-purposed drugs that slow SARS-CoV-2 replication in human cells" a new screen of FDA approved drugs against SARS-CoV-2. The authors based their screen on Vero and HUH7 cell lines. The methods applied for screening, including the SARS-CoV-2-ΔOrf7a-NLuc modified virus, are properly designed and performed. This is an interesting study that finds several potential drugs that might be effective as anti SARS-CoV-2 therapies. However, such experiments have been done throughout the last year and the novelty and importance of these findings are questionable.

      Regarding this point, there are several studies that have attempted to identify compounds that impact on SARS-Cov2 infection; however, these do not specifically focus on the replication of the virus (studies have used viability markers and staining of viral proteins but many of the compounds identified exert their effects on the virus uptake). Whilst SARS-CoV2-Nluc viruses have been developed these have been used for infection studies to measure the amount of virus taken up by cells and have not further explored how they impact on virus replication. Therefore, we feel that our study shows that a reporter virus can be used to reflect virus replication.


      **Major comments:**

      1. Most of the experiments presented were done only twice. While this should not be a problem for the screen itself, at least three experiments are suggested for verifying the drugs identified (Figure 5 and Supplemental Figure 6).

      At the time of submission there was an urgent need to make our data accessible to the scientific community. Therefore, we performed some experiments with n=2. We used n=2 to validate the screen and each time we got the same experimental outcome. We would perform further repeats for the figures mentioned for publication.

      To strengthen the results of the screen, the wild type virus should also be tested for plaque reduction assay with these nine drugs.

      We will perform these experiments and present these data in the manuscript. We have already performed immunostaining of WT-virus infected cells and could include this as an alternative.


      Identification of antivirals is important for SARS-CoV-2 and other coronaviruses, regardless of the presence or effectiveness of vaccines. I think the abstract and introduction should be written to emphasize this point (instead of trying to underestimate the vaccine effectiveness). Similarly, the authors ignore the relative failures in clinical trials of known antivirals (known to inhibit SARS-CoV-2 replication in vitro, like Remdesivir) and suggest starting clinical trials with their screen results. I think that this suggestion is premature and requires several more studies (including animal studies) before initiating clinical trials.

      We will re-write this section of the manuscript. We have identified all compounds that have been evaluated in the AGILE clinical trial, and these compounds failed to show a patient benefit and also failed to impact on virus replication in human cells.


      **Minor comments.**

      1. The error bars are not defined in any of the figures. I am not sure that error bars are even meaningful if experiments were only done twice; I recommend showing the two results for each point.

      We will add additional repeats or, as the reviewer suggests, we could add the two points.

      Figure 1E and the tables (especially supp tables 3 and 4) don't have legends.

      Apologies, this will be amended.


      Most graphs will benefit from presenting the results in logarithmic scale (all Luc counts/ qPCRs).

      This can be changed if editors agree.

      P6 in the Generation of functional SARS-CoV-2 virus section - a reference is missing "It has been reported that this aids the recovery of replicative virus (Insert ref 3)"

      Apologies, this will be amended.

      Reviewer #1 (Significance (Required)):

      This is a well performed drug screen on two cell lines that identified new potential FDA approved drugs as anti-SARS-CoV-2 inhibitors. There are several studies that have already been published or distributed as preprints that have done similar experiments in other cell lines, including more relevant lung epithelial cells (for example PMC743673). This study does not verify the screen results by additional methods. However, in the current pandemic situation this study could be important and interesting to follow up.

      I am a virologist; my expertise is in viral host interactions within infected cell.

      We were unable to identify the paper which is referred to in the reviewer’s comments. We would aim to highlight further in the text that using the reporter virus, we are able to screen and identify compounds that impact on virus replication unlike many of the other studies.


      **Referee Cross-commenting**

      No problem with the other comments

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      In this manuscript, the authors report on the creation of a luciferase-encoding SARS-CoV-2 (deleted for orf7a) and the use of this virus to test infectability of multiple cell lines as well as perform a drug repurposing screen in two cell lines (Vero E6 and Huh7). Of the 35 drugs that blocked the virus replication they further identify 9 drugs that have a (mild) effect on replication when administered 24 hours post infection.

      An important note here is that many studies which have identified potential therapeutics for SARS-CoV-2 have performed experiments whereby cells are pre-treated with compounds prior to infection. We have been able to perform the same experiments and many of the drugs were unable to prevent replication after infection. The 9 compounds we have identified retain the ability to inhibit replication when applied post-infection. This sets our study apart from other screens that have been conducted for SARS-CoV-2.


      **Major comments:**

      1. Figure 2: What's the difference between "Luminescence counts above noise" in Fig 2B and "Luminescence counts per second" in Figure 2C,D? It seems like there is no difference in luminescence between 1 PFU and 100 PFU (and if anything, the baseline for 1 PFU is higher, >1.5M, compared to 100 PFU, where it is below 1M). One would expect more luminescence in the 100 PFU experiment, as seen in Fig 2B. Also, Fig 2B does not mention how many replicates were done, or what the **** stands for.

      Thank you for the comment. The difference between “luminescence counts above noise” and “luminescence counts per second” is set out in Figure 2A. When adding more virus the baseline level should increase, as also demonstrated in Supplemental Figure 3. However, the degree of background luminescence varies between virus batches, presumably due to the degree of cell lysis in each sample. You will note in the Supplemental figure that the baseline level for our P4 viral stock is lower than for P1. We performed the experiments in Figure 2C using P1 virus stocks and for Figure 2D we used P4 virus. For clarity this information will be included in the figure legend and the data presented as luminescence over background.

      The authors do not explain why deleting orf7a was needed to generate the NLuc virus. Was there a rationale for this?

      Orf7a has been successfully removed from SARS-CoV and SARS-CoV2 in order to incorporate traceable proteins such as fluorescent or bioluminescent proteins. We describe this at the start of the results section. “Orf7a has previously been removed in SARS-CoV and SARS-CoV-2 and yielded infectious and replicative virus particles (Thi Nhu Thao et al, 2020; Xie et al, 2020a; Xie et al, 2020b)”.

      Figure 5C - IC50 should be properly determined from compounds where the lowest concentration tested was still inhibitory (such as LY2835219 and panobinostat).

      These experiments can be conducted within 2 weeks. However, we do not feel that this would provide additional information to the reader. The aim of these figures is to demonstrate that these compounds have dose-dependent effects on the replication of SARS-CoV-2.
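
      The reviewer's point about IC50 can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a simple log-linear interpolation between the two tested doses that bracket 50% of the untreated signal, and the function name `interpolate_ic50` and all numbers are hypothetical. When even the lowest tested dose already inhibits by more than half, no bracketing pair exists and the sketch returns `None`, which is the situation the reviewer describes.

```python
import math


def interpolate_ic50(concentrations, responses, half=50.0):
    """Estimate IC50 by log-linear interpolation (illustrative assumption only).

    concentrations: ascending, positive doses in a single unit.
    responses: signal at each dose as a percentage of the untreated control.
    Returns None when no pair of adjacent doses brackets the half-maximal response.
    """
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        # A bracketing pair has one response at or above 50% and the next at or below it.
        if r_lo >= half >= r_hi and r_lo != r_hi:
            frac = (r_lo - half) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose series that crosses 50% between the two doses.
bracketed = interpolate_ic50([1.0, 100.0], [75.0, 25.0])

# Hypothetical series where the lowest dose is already below 50%: IC50 is undetermined.
not_bracketed = interpolate_ic50([0.1, 1.0, 10.0], [40.0, 20.0, 5.0])
```

      In the second case the dose range would need to be extended downward before an IC50 could be reported; a full analysis would more typically fit a sigmoidal dose-response model rather than interpolate.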

      Supplementary tables must be provided in an excel or similar file format. The PDF version is both unreadable and does not allow other researchers to probe the dataset for their own interests.

      This would be amended during revision of our submission.

      **Minor comments:**

      1. Intro: "SARS-CoV-2 infection in patients with COVID-19 can result in pulmonary distress, inflammation, and broad tissue tropism". Broad tissue tropism is not a result of infection, please rephrase.

      Patients with COVID-19 are reported to have liver and kidney damage. This could be a direct result of SARS-CoV-2 infection or an indirect result of the cytokine storm. Our data show that kidney and liver cells are highly susceptible to SARS-CoV-2 infection and support replication in culture. We thank the reviewer for their comment and we will rephrase this statement and cite relevant literature.

      Fig S1D - why are the MOI different for WT (moi 0.1) and NLuc mutant (moi 1) ?

      This was used to demonstrate the lack of replication of the WT virus in lung epithelial cells, the same MOI used in Vero cells demonstrates that the levels of the nucleocapsid protein increases when compared to other cell types. We have also used an MOI of 10 for the NLuc virus to be able to detect the NLuc protein. This information would be added to the figure legend.


      Fig S3 - using volume of virus in µl is problematic, as it doesn't allow for proper comparison between the passages. The authors should express the virus amount in PFU or MOI.

      This will be amended
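
      The conversion the reviewer asks for is simple arithmetic once a titer is known. The sketch below is an illustration under stated assumptions, not the authors' protocol: the function names are hypothetical, and the titers and cell count are made-up values. It uses the standard definitions that PFU added equals titer (PFU/ml) times volume (ml), and that MOI equals PFU added divided by the number of cells.

```python
def pfu_from_volume(volume_ul, titer_pfu_per_ml):
    """Convert an inoculum volume in microlitres to plaque-forming units."""
    return titer_pfu_per_ml * volume_ul / 1000.0


def moi(pfu, cell_count):
    """Multiplicity of infection: infectious units added per cell."""
    return pfu / cell_count


# Hypothetical example: the same 10 µl gives different doses for two stocks.
stock_a = pfu_from_volume(10, 1_000_000)   # assumed titer of 1e6 PFU/ml
stock_b = pfu_from_volume(10, 100_000)     # assumed titer of 1e5 PFU/ml
moi_a = moi(stock_a, 100_000)              # assumed 1e5 cells per well
moi_b = moi(stock_b, 100_000)
```

      With these assumed numbers, equal volumes of the two stocks differ tenfold in MOI, which is why volume alone does not allow comparison between passages.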


      Fig S5 - in panel A - what do the colors represent? What is 0-1?. The number of repetitions for each panel should be indicated.

      Apologies, relative expression should have been added alongside the scale. N=3 for these experiments; this will be added to the figure legend.


      The "NLuc activity as a marker of virus replication" and "SARS-CoV-2 replication screen validation" are largely overlapping and should be edited.

      We would combine these sections.


      Methods: "Generation of functional SARS-CoV-2 virus" - the authors confuse "virus" with "plasmid". They should also include the reference marked "(Insert ref 3)"

      Apologies, this will be amended


      Reviewer #2 (Significance (Required)):

      1. My main concern is that a very similar, if not identical, NLuc encoding virus has been reported in October 2020 (https://www.nature.com/articles/s41467-020-19055-7#Sec9). While the authors cite this paper, they only do so to say that "Orf7a has previously been removed in SARS-CoV and SARS-CoV-2 and yielded infectious and replicative virus particles", without mentioning this was done to generate the same NLuc carrying virus reported in their work. Thus the generation of this virus is not a "new tool" as the authors would seem to suggest.

      Whilst this is not the first use of a NLuc SARS-CoV-2 virus, this is the first time that the virus has been utilised to screen for compounds that affect replication. The study mentioned neither screens nor monitors the replication of the virus; the authors monitor only the capability of the virus to infect cells during the first 24 hours.

      While drug repurposing screens have been performed, the additional validation in Vero E6 and Huh7 cells is of some interest to those working on anti-viral therapies, provided that the authors change their supplementary tables to a format that is accessible to other researchers.

      This will be amended for the submission.


      My expertise: I study virus-host interactions (not coronaviruses). In the last year I have been involved in several drug repurposing efforts against SARS-CoV-2.

      **Referee Cross-commenting**

      No problem with the other comments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors report on the creation of a luciferase-encoding SARS-CoV-2 (deleted for orf7a) and the use of this virus to test infectability of multiple cell lines as well as perform a drug repurposing screen in two cell lines (Vero E6 and Huh7). Of the 35 drugs that blocked the virus replication they further identify 9 drugs that have a (mild) effect on replication when administered 24 hours post infection.

      Major comments:

      1. Figure 2: What's the difference between "Luminescence counts above noise" in Fig 2B and "Luminescence counts per second" in Figure 2C,D? It seems like there is no difference in luminescence between 1 PFU and 100 PFU (and if anything, the baseline for 1 PFU is higher, >1.5M, compared to 100 PFU, where it is below 1M). One would expect more luminescence in the 100 PFU experiment, as seen in Fig 2B. Also, Fig 2B does not mention how many replicates were done, or what the ** stands for.
      2. The authors do not explain why deleting orf7a was needed to generate the NLuc virus. Was there a rationale for this?
      3. Figure 5C - IC50 should be properly determined from compounds where the lowest concentration tested was still inhibitory (such as LY2835219 and panobinostat).
      4. Supplementary tables must be provided in an excel or similar file format. The PDF version is both unreadable and does not allow other researchers to probe the dataset for their own interests.

      Minor comments:

      1. Intro: "SARS-CoV-2 infection in patients with COVID-19 can result in pulmonary distress, inflammation, and broad tissue tropism". Broad tissue tropism is not a result of infection, please rephrase.
      2. Fig S1D - why are the MOI different for WT (moi 0.1) and NLuc mutant (moi 1) ?
      3. Fig S3 - using volume of virus in µl is problematic, as it doesn't allow for proper comparison between the passages. The authors should express the virus amount in PFU or MOI.
      4. Fig S5 - in panel A - what do the colors represent? What is 0-1?. The number of repetitions for each panel should be indicated.
      5. The "NLuc activity as a marker of virus replication" and "SARS-CoV-2 replication screen validation" are largely overlapping and should be edited.
      6. Methods: "Generation of functional SARS-CoV-2 virus" - the authors confuse "virus" with "plasmid". They should also include the reference marked "(Insert ref 3)"

      Significance

      1. My main concern is that a very similar, if not identical, NLuc encoding virus has been reported in October 2020 (https://www.nature.com/articles/s41467-020-19055-7#Sec9). While the authors cite this paper, they only do so to say that "Orf7a has previously been removed in SARS-CoV and SARS-CoV-2 and yielded infectious and replicative virus particles", without mentioning this was done to generate the same NLuc carrying virus reported in their work. Thus the generation of this virus is not a "new tool" as the authors would seem to suggest.
      2. While drug repurposing screens have been performed, the additional validation in Vero E6 and Huh7 cells is of some interest to those working on anti-viral therapies, provided that the authors change their supplementary tables to a format that is accessible to other researchers.

      My expertise: I study virus-host interactions (not coronaviruses). In the last year I have been involved in several drug repurposing efforts against SARS-CoV-2.

      Referee Cross-commenting

      No problem with the other comments.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Pickard et al. present in the manuscript entitled "Discovery of re-purposed drugs that slow SARS-CoV-2 replication in human cells" a new screen of FDA approved drugs against SARS-CoV-2. The authors based their screen on Vero and HUH7 cell lines. The methods applied for screening, including the SARS-CoV-2-ΔOrf7a-NLuc modified virus, are properly designed and performed. This is an interesting study that finds several potential drugs that might be effective as anti SARS-CoV-2 therapies. However, such experiments have been done throughout the last year and the novelty and importance of these findings are questionable.

      Major comments:

      1. Most of the experiments presented were done only twice. While this should not be a problem for the screen itself, at least three experiments are suggested for verifying the drugs identified (Figure 5 and Supplemental Figure 6).
      2. To strengthen the results of the screen, the wild type virus should also be tested for plaque reduction assay with these nine drugs.
      3. Identification of antivirals is important for SARS-CoV-2 and other coronaviruses, regardless of the presence or effectiveness of vaccines. I think the abstract and introduction should be written to emphasize this point (instead of trying to underestimate the vaccine effectiveness). Similarly, the authors ignore the relative failures in clinical trials of known antivirals (known to inhibit SARS-CoV-2 replication in vitro, like Remdesivir) and suggest starting clinical trials with their screen results. I think that this suggestion is premature and requires several more studies (including animal studies) before initiating clinical trials.

      Minor comments.

      1. The error bars are not defined in any of the figures. I am not sure that error bars are even meaningful if experiments were only done twice; I recommend showing the two results for each point.
      2. Figure 1E and the tables (especially supp tables 3 and 4) don't have legends.
      3. Most graphs will benefit from presenting the results in logarithmic scale (all Luc counts/ qPCRs).
      4. P6 in the Generation of functional SARS-CoV-2 virus section - a reference is missing "It has been reported that this aids the recovery of replicative virus (Insert ref 3)"

      Significance

      This is a well performed drug screen on two cell lines that identified new potential FDA approved drugs as anti-SARS-CoV-2 inhibitors. There are several studies that have already been published or distributed as preprints that have done similar experiments in other cell lines, including more relevant lung epithelial cells (for example PMC743673). This study does not verify the screen results by additional methods. However, in the current pandemic situation this study could be important and interesting to follow up.

      I am a virologist; my expertise is in viral host interactions within infected cell.

      Referee Cross-commenting

      No problem with the other comments

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer response

      We thank the reviewers for their response. All reviewers find our study (potentially) interesting and/or a resource to gain further understanding on BAP1 molecular functions. They also have some common comments.

      The reviewers would prefer to see further characterization of the interactions and their functional effects. We would have liked to address this but found for COPI that knockdown of these genes is lethal, whereas on the BAP1 side the interactions are mapped to the functionally critical C-terminus, making these experiments technically extremely challenging. These issues, unfortunately, preclude further validation studies at this point. Nevertheless, we do feel that the quality of our interaction dataset is such that it is worth publishing these findings for this important tumor suppressor.

      Most reviewers would like us to place the data more in context. To address this, we have extended the discussion, highlighting the essence of our findings and how we envisage this could impact BAP1 function.

      Finally, both reviewers 1 and 3 would like the results section to be more succinct, and we have shortened it to improve readability.

      Other points are addressed in the point-by-point response to individual reviewers below.

      Point-by-point response to reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Very interesting study on BAP1 tumor suppressor. The work needs further characterization of the interacting partners identified.

      Reviewer #1 (Significance (Required)):

      BAP1 is an important tumor suppressor mutated in several malignancies, and the mechanisms of action of this deubiquitinase are far from being completely understood.

      This interesting work aimed at identifying novel cytoplasmic partners of BAP1 which can be highly relevant to its tumor suppressor function. BAP1 is predominantly nuclear, but can also be found in the cytoplasm. New insights into the cytoplasmic functions of BAP1 are needed.

      The manuscript is overall well written and the data are very solid.

      The manuscript would need additional work before acceptance

      **Comments**

      1) The abstract can be improved to reflect the data of the manuscript.

      Unfortunately, we do not understand what part of the abstract is meant by the reviewer, which makes it hard to address.

      2) In the results section, the manuscript could emphasize the results rather than the technical aspects

      We've improved readability of this part of the manuscript by moving technical parts that are not required for interpretation of the results to the material and method section, a supplementary text and a new supplemental figure 1 (causing the original numbering to shift).

      3) It would be interesting to further investigate the significance of some key interactions

      We agree that these questions are of importance (see reviewer 2 point 2, reviewer 3 point 1). We have tried to address these questions using gene knockdown techniques. However, the importance of regulation of protein transport and vesicle formation by COPI translates to lethal effects on cell viability upon knockdown of these genes making these experiments technically impossible to execute. Further functional investigation is technically and financially beyond the scope and possibilities of this paper.

      4) The discussion is quite short and can put the findings in perspective

      We've extended the discussion to place results into perspective, also regarding the potential role of BAP1 activity towards potential substrates (point 8). This will help to highlight the important findings of the research.

      5) It would be interesting to test some cancer-associated mutations

      The interaction is in the C-terminus of BAP1, which combines several functions[1, 2]. This would dramatically complicate the interpretation of results. Particularly the presence of the NLS, a major regulatory posttranslational modification[3] and the recruitment signal for nucleosomes could all interfere with BAP1 function independently.

      6) Figure 4 can be improved

      Thanks for this comment. We have increased readability of the figure by adding schematic representations of the constructs used and a legend that explains the color-coding of the interactors. We have also removed dotted lines to make it less busy.

      7) Yu et al MCB 2010 is one of the key papers on BAP1 purification and can be cited

      We apologize for omitting this reference and have included it in the revised manuscript.

      8) The authors can discuss potential substrates of BAP1 and mechanism of deubiquitination

      We've extended the discussion to this effect.

      **Referee Cross-commenting**

      I agree with the comments of Reviewer #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      BAP1 is a deubiquitinase that mainly functions to control H2Aub levels in nucleus. BAP1 is a tumor suppressor with mutations or deletions in several human cancers. Previous studies have identified many interacting proteins with BAP1, most notably its binding partners involved in PR-DUB complex, such as ASXL1/2, FOXK1/2, HCFC1, OGT etc. In this study, by using FRT-mediated recombination, the authors generated tagged-BAP1 expressed from its endogenous promoter and conducted AP-MS analysis to identify BAP1 interacting proteins in both nuclear and cytoplasmic fractions. These analyses identified several new BAP1-interacting proteins in cytoplasmic fractions, including histone acetyltransferase 1 (HAT1) and the heptameric coat protein complex I (COPI, which is involved in protein sorting and trafficking). The authors further confirmed the interactions between BAP1 and HAT1 (as well as a COPI subunit) at the endogenous level.

      **Major comments:**

      Overall, the current study has relatively limited data and limited scope: it is basically a proteomic study focusing on a single protein (which has already been subjected to several proteomic studies in previous publications). There is significant room for the authors to improve this potentially interesting study (see below for specific comments), although this may take substantial additional effort.

      1. In the current manuscript, there is no data to further characterize the interactions between BAP1 and HAT1 (or COPI). For protein-protein interaction studies, readers generally are interested in information such as whether these bindings are direct or indirect (particularly for COPI because it contains multiple subunits), and which regions mediate the interactions?

      Our mass spectrometry data suggest that binding of BAP1 is mediated through its C-terminus. Mutation of the KxKxx motif has shown that this motif is not involved. Mapping the interaction more specifically is likely to be very difficult, as the C-terminus of BAP1 is involved in many different functions and contains many important elements (ULD domain, CTE, NLS) required for its function. Mutational analysis aimed at mapping the interaction would induce many secondary effects, as protein localization and substrate targeting would be severely affected, as shown by us and other research groups. Identifying which subunit of the COPI complex mediates the interaction requires purification of these proteins along with purified full-length active BAP1; attempts at this have been made but have so far been unsuccessful. Further investigation is technically and financially beyond the scope and possibilities of this paper.

      2. No data to study the functional significance of the identified protein-protein interactions. This is a major weakness of the current study. For example, HAT1 is a histone acetyltransferase and mainly functions in the nucleus. Does the BAP1-HAT1 interaction in the cytoplasm suggest that they have functions in the cytoplasm independent of their canonical function in regulating transcription in the nucleus? Likewise, does the BAP1-COPI interaction suggest that BAP1 is somehow involved in regulating protein sorting and trafficking?

      Please see our general response and reviewer 1 point 3.

      3. The identified interactions appear to be weak. These proteins are located near the edge of the significance curve in the volcano plots (Fig. 1C-1D and others). The IP data also appear to be weak; for example, in Fig. 5D it is hard to see the COPA blot in the BAP1 IP. The COPA IB signal from the 5% input WCL is probably a hundred-fold stronger than that from the BAP1 IP. Weak interactions do not necessarily mean they are not important; however, there is no functional data to support this claim (see the point above).

      The essence of our paper is that the interactions that are barely visible in figure 1 gain significance in the absence of endogenous BAP1, to the point where all COPI subunits are identified as confidently as previously validated BAP1 interactors such as the ASXL, FOXK and HCFC proteins (figure 4). Our quantifications indicate that the new interactors have lower stoichiometry, and this may explain why they were harder to identify. This observation is addressed in the discussion section.

      4. It seems that the entire study focuses on one specific cell line. Repeating the analyses in other cell lines could help boost the robustness and significance of the study.

      As discussed under the previous point, the removal of endogenous BAP1 was important for significance, and since BAP1 is a common essential gene, we do not have any other cell lines in which this would be possible. However, we have been able to confirm the HAT1 interaction with BAP1 in U2OS cells at endogenous levels (Fig 1E,F).

      5. The major interacting proteins identified from the cytoplasmic fraction are still those mainly localized in the nucleus (such as HCFC1, FOXK1/2). A western blot showing the nuclear vs cytoplasmic fractions is required.

      Our immunoblot of the cytoplasmic and nuclear input samples used for figure 4 shows proper separation of the fractions without major leakage (supplemental figure 4; Tubulin and Abraxas lanes as cytoplasmic and nuclear markers, respectively), while the iBAQ data show that a substantial amount of protein is bound (stoichiometry.xls). This is corroborated by the sample correlation data in supplemental figure 5B, which show very little correlation between cytoplasmic and nuclear samples in the mass spectrometry experiment. These data show that the cytoplasmic interactions with partners that are mainly localized in the nucleus are real interactions and are not due to mixing of cellular compartments.

      **Minor comments:**

      1. page 9 "A GFP coIP experiment of both GFP-BAP1 and the catalytic-dead BAP1 C91S mutant shows coimmunoprecipitation of HAT1 (Figure 1F)." here Fig. 1F should be Fig. 1E.

      The Figure was indeed mislabeled. Because of the addition of a new supplemental figure for reviewer 1 point 2 some figure numbers have shifted. The old figure 1E has now become 1C and is now numbered accordingly.

      Reviewer #2 (Significance (Required)):

      The current study is limited in scope. Without functional data for these interactions, the overall significance of the study is likely limited.

      **Referee Cross-commenting**

      My overall assessment is similar to that of the other two reviewers (particularly reviewer 3): the study is rather descriptive, limited in scope, and lacks mechanistic understanding of BAP1 functions: for example, see reviewer 1 comment "it would be interesting to further investigate the significance of some key interactions"; reviewer 3 comment "The present manuscript contained very little information beyond description of BAP1 interactomes and subsequent validation of BAP1-COPI interaction. At the very least, I would recommend for the authors to explore contextual significances and/or regulations of the novel BAP1-COPI interaction."

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In the present manuscript, Baas and colleagues seek to identify novel BRCA1-associated protein 1 (BAP1) interacting partners. To do so, the authors performed affinity purifications of different GFP-tagged BAP1 constructs in combination with mass spectrometry from either wild-type HeLa cells or HeLa cells in which endogenous BAP1 expression had been knocked out using CRISPR/Cas9. MS analysis of the pull-downs revealed COPI as a novel cytoplasmic interactor of full-length GFP-BAP1, in addition to other previously known BAP1 interactors such as HAT1, ASXL1/2, FOXK1/K2 and OGT.

      The authors subsequently went on to validate the BAP1-COPI interaction and observed that such interaction was independent of the canonical COPI binding motifs KxKxx present in the BAP1 C-terminus.

      **Major comments:**

      The present manuscript contained very little information beyond description of BAP1 interactomes and subsequent validation of BAP1-COPI interaction. At the very least, I would recommend for the authors to explore contextual significances and/or regulations of the novel BAP1-COPI interaction.

      Please see our general response and reviewer 1 point 3.

      The present manuscript could be written in a more concise manner.

      We have shortened the results section as discussed under reviewer 1 point 2 to make it more concise.

      Reviewer #3 (Significance (Required)):

      While the present study may provide a resource to gain further understanding of BAP1 molecular functions, it is very difficult to appreciate the significance of the present manuscript in its current descriptive form.

      We have expanded the discussion to better explain the significance of our findings.

      1. Sahtoe, D.D., et al., BAP1/ASXL1 recruitment and activation for H2A deubiquitination. Nat Commun, 2016. 7: p. 10292.
      2. Ventii, K.H., et al., BRCA1-associated protein-1 is a tumor suppressor that requires deubiquitinating activity and nuclear localization. Cancer Res, 2008. 68(17): p. 6953-62.
      3. Mashtalir, N., et al., Autodeubiquitination protects the tumor suppressor BAP1 from cytoplasmic sequestration mediated by the atypical ubiquitin ligase UBE2O. Mol Cell, 2014. 54(3): p. 392-406.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In the present manuscript, Baas and colleagues seek to identify novel BRCA1-associated protein 1 (BAP1) interacting partners. To do so, the authors performed affinity purifications of different GFP-tagged BAP1 constructs in combination with mass spectrometry from either wild-type HeLa cells or HeLa cells in which endogenous BAP1 expression had been knocked out using CRISPR/Cas9. MS analysis of the pull-downs revealed COPI as a novel cytoplasmic interactor of full-length GFP-BAP1, in addition to other previously known BAP1 interactors such as HAT1, ASXL1/2, FOXK1/K2 and OGT.

      The authors subsequently went on to validate the BAP1-COPI interaction and observed that such interaction was independent of the canonical COPI binding motifs KxKxx present in the BAP1 C-terminus.

      Major comments:

      The present manuscript contained very little information beyond description of BAP1 interactomes and subsequent validation of BAP1-COPI interaction. At the very least, I would recommend for the authors to explore contextual significances and/or regulations of the novel BAP1-COPI interaction.

      The present manuscript could be written in a more concise manner.

      Significance

      While the present study may provide a resource to gain further understanding of BAP1 molecular functions, it is very difficult to appreciate the significance of the present manuscript in its current descriptive form.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      BAP1 is a deubiquitinase that mainly functions to control H2Aub levels in nucleus. BAP1 is a tumor suppressor with mutations or deletions in several human cancers. Previous studies have identified many interacting proteins with BAP1, most notably its binding partners involved in PR-DUB complex, such as ASXL1/2, FOXK1/2, HCFC1, OGT etc. In this study, by using FRT-mediated recombination, the authors generated tagged-BAP1 expressed from its endogenous promoter and conducted AP-MS analysis to identify BAP1 interacting proteins in both nuclear and cytoplasmic fractions. These analyses identified several new BAP1-interacting proteins in cytoplasmic fractions, including histone acetyltransferase 1 (HAT1) and the heptameric coat protein complex I (COPI, which is involved in protein sorting and trafficking). The authors further confirmed the interactions between BAP1 and HAT1 (as well as a COPI subunit) at the endogenous level.

      Major comments:

      Overall, the current study has relatively limited data and limited scope: it is basically a proteomic study focusing on a single protein (which has already been the subject of several proteomic studies in previous publications). There is significant room for the authors to improve this potentially interesting study (see below for specific comments), although this may take substantial additional effort.

      1. In the current manuscript, there is no data to further characterize the interactions between BAP1 and HAT1 (or COPI). For protein-protein interaction studies, readers generally are interested in information such as whether these bindings are direct or indirect (particularly for COPI because it contains multiple subunits), and which regions mediate the interactions?
      2. No data to study the functional significance of the identified protein-protein interactions. This is a major weakness of the current study. For example, HAT1 is a histone acetyltransferase and mainly functions in the nucleus. Does the BAP1-HAT1 interaction in the cytoplasm suggest that they have functions in the cytoplasm independent of their canonical function in regulating transcription in the nucleus? Likewise, does the BAP1-COPI interaction suggest that BAP1 is somehow involved in regulating protein sorting and trafficking?
      3. The identified interactions appear to be weak. These proteins are located near the edge of the significance curve in the volcano plots (Fig. 1C-1D and others). The IP data also appear to be weak; for example, in Fig. 5D it is hard to see the COPA blot in the BAP1 IP. The COPA IB signal from the 5% input WCL is probably a hundred-fold stronger than that from the BAP1 IP. Weak interactions do not necessarily mean they are not important; however, there is no functional data to support this claim (see the point above).
      4. It seems that the entire study focuses on one specific cell line. Repeating the analyses in other cell lines could help boost the robustness and significance of the study.
      5. The major interacting proteins identified from the cytoplasmic fraction are still those mainly localized in the nucleus (such as HCFC1, FOXK1/2). A western blot showing the nuclear vs cytoplasmic fractions is required.

      Minor comments:

      1. page 9 "A GFP coIP experiment of both GFP-BAP1 and the catalytic-dead BAP1 C91S mutant shows coimmunoprecipitation of HAT1 (Figure 1F)." here Fig. 1F should be Fig. 1E.

      Significance

      The current study is limited in scope. Without functional data for these interactions, the overall significance of the study is likely limited.

      Referee Cross-commenting

      My overall assessment is similar to that of the other two reviewers (particularly reviewer 3): the study is rather descriptive, limited in scope, and lacks mechanistic understanding of BAP1 functions: for example, see reviewer 1 comment "it would be interesting to further investigate the significance of some key interactions"; reviewer 3 comment "The present manuscript contained very little information beyond description of BAP1 interactomes and subsequent validation of BAP1-COPI interaction. At the very least, I would recommend for the authors to explore contextual significances and/or regulations of the novel BAP1-COPI interaction."

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Very interesting study on BAP1 tumor suppressor. The work needs further characterization of the interacting partners identified.

      Significance

      BAP1 is an important tumor suppressor mutated in several malignancies, and the mechanisms of action of this deubiquitinase are far from being completely understood.

      This interesting work aimed at identifying novel cytoplasmic partners of BAP1, which can be highly relevant to its tumor suppressor function. BAP1 is predominantly nuclear, but can also be found in the cytoplasm. New insights into the cytoplasmic functions of BAP1 are needed.

      The manuscript is overall well written and the data are very solid.

      The manuscript would need additional work before acceptance

      Comments

      1) The abstract can be improved to reflect the data of the manuscript.

      2) The results section of the manuscript could emphasize the results rather than the technical aspects

      3) It would be interesting to further investigate the significance of some key interactions

      4) The discussion is quite short and can put the findings in perspective

      5) It would be interesting to test some cancer-associated mutations

      6) Figure 4 can be improved

      7) Yu et al MCB 2010 is one of the key papers on BAP1 purification and can be cited

      8) The authors can discuss potential substrates of BAP1 and mechanism of deubiquitination

      Referee Cross-commenting

      I agree with the comments of Reviewer #2

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Summary:

      In this work the authors present a simple mathematical model for the distribution of morphogen molecules that travel via cytonemes through a 1-dimensional system. This model is used as a basis for a software package called Cytomorph that takes as an input a set of experimentally measured distributions of cytoneme dynamics as well as experimenter-determined parameters such as contact probability and method of cytoneme growth and retraction. The Cytomorph package then outputs spatial and temporal information on the distribution of morphogen as well as cytonemes and their contacts with cells and other cytonemes, all obtained over thousands of simulation runs. A number of in silico experiments are then performed to show that these outputs agree with experimentally measured morphogen distributions of Hedgehog in the imaginal wing disc and abdominal histoblast nest. Further in silico experimentation is done to study how this distribution is affected by a wide array of parameters such as producer row number, cytoneme connection method, and connection probability function. Comparisons to the traditional diffusion-based model are also made. The authors find a suite of results based on these experiments and accordingly present the Cytomorph software package as a useful and adaptable tool for the community.

      Major comments:

      While the various in silico experiments present an expansive and exhaustive study of the different ways in which Cytomorph can be used to examine a cytoneme based distribution system, the machinery behind the software is left notably underdescribed. The authors do not sufficiently make clear what exactly happens within each iteration of the simulations run by Cytomorph, leaving the results irreproducible without the reader going into and deciphering the software code itself.

      In order to improve the description of the mathematical and computational steps behind the software, we have created a visual organigram (new Supplementary Figure S.1) with a detailed depiction of the steps. We have also included a short description in the main text and an extended explanation in the Supplementary Material section.

      Some of the specific details left undiscussed are how it is determined when and where a cytoneme will spawn or what its maximum length will be, the dynamics of morphogen transport within the cytonemes, the effects of one cytoneme making multiple connections on how much morphogen is delivered through each connection, and where exactly stochasticity is introduced so as to allow for variations between simulation runs; amongst others.

      In the new description of the software steps, we have tried to address the Referee’s comments about the dynamics and stochasticity in more detail. In order to help the understanding of the variables, we have also tried to improve their description in the main text.

      Additionally, when the authors investigate the diffusion model their stated boundary conditions do not match those presented at the end of the Materials and Methods section. The initial condition u(x,0)=0 and boundary condition du(L,t)/dt=0 represent a perfectly absorbing molecule sink at the x=L end of the system, not the reflecting boundary condition du(L,t)/dx=0 that would correspond to a zero morphogen flux.

      We thank the Referee for noticing this notation mistake: the derivative in the boundary condition is indeed taken with respect to x (dx), not t (dt). We have corrected this error and included in the Supplementary Materials the exact lines of code used in Matlab pdepe, to document the conditions used when solving the diffusion equation (new Supplementary Figure S.10).
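
For illustration only, the distinction between the two boundary conditions can be made concrete with a minimal Python sketch of a 1D diffusion-degradation problem that starts from u(x,0)=0 and imposes the zero-flux (reflecting) condition du/dx=0 at x=L. This is not the Matlab pdepe code referred to in the reply. The explicit finite-difference scheme, the fixed source concentration at x=0, the linear degradation term, and every parameter name and value below are assumptions made for the example.

```python
import numpy as np


def diffuse_1d(n_cells=50, length=50.0, d_coef=1.0, decay=0.01,
               source=1.0, t_end=200.0):
    """Explicit finite-difference sketch of u_t = D u_xx - k u on [0, L].

    Assumed conditions (illustrative, not taken from the manuscript):
      u(x, 0) = 0            initial condition
      u(0, t) = source       fixed concentration at the producing end
      du/dx (L, t) = 0       zero flux (reflecting) at the far end
    """
    dx = length / (n_cells - 1)
    # Time step well below the explicit stability limit dx^2 / (2 D).
    dt = 0.25 * dx ** 2 / d_coef
    u = np.zeros(n_cells)
    u[0] = source
    for _ in range(int(t_end / dt)):
        lap = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx ** 2
        # Ghost node u[N] = u[N-2] enforces du/dx = 0 at x = L.
        lap[-1] = 2.0 * (u[-2] - u[-1]) / dx ** 2
        u = u + dt * (d_coef * lap - decay * u)
        u[0] = source  # re-impose the source boundary value
    return u


profile = diffuse_1d()
print(profile[:5], profile[-5:])
```

With the reflecting condition the profile flattens near x=L instead of being pinned there, which is the behaviour a zero morphogen flux is meant to describe.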

      Finally, while the authors spend a great deal of effort analyzing signal variability between simulation runs, there is no effort made to account for the inherently stochastic nature of molecular production, movement, and degradation. Particularly if molecule numbers are small, fluctuations in these processes could greatly increase signal variability. The authors should either address why these fluctuations are negligible or include them in the modelling.

      This work is mainly focused on the transport of the morphogen; other terms, such as degradation, were introduced directly using published experimental data. Regarding the main concern about whether the fluctuations in cytoneme transport are negligible, we agree with the Referee on the importance of this point. Therefore, we have included a detailed description of the variability and fluctuations in a new section of the Supplementary Material. To aid understanding, we have also included a new Supplementary Figure (Supplementary Figure S.11).

      The largest fluctuations were found at the tail of the morphogen gradient (last rows of receiving cells). Since this corresponds to the region where the amount of morphogen is low, the absolute fluctuations do not change the activation of the low-threshold target. We then conclude that those fluctuations are biologically negligible for our study.
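
The Referee's remark about small molecule numbers can be illustrated under one common simplifying assumption, namely that the number of morphogen molecules received by a cell is Poisson distributed. In that case the relative fluctuation (coefficient of variation) is 1/sqrt(mean), so it grows as the mean count falls. The sketch below is not Cytomorph's noise model; the function name and the example counts are hypothetical.

```python
import math


def poisson_cv(mean_count):
    """Coefficient of variation (std / mean) of a Poisson variable.

    For a Poisson distribution the variance equals the mean, so
    std / mean = sqrt(mean) / mean = 1 / sqrt(mean).
    """
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


# Hypothetical mean molecule counts, from the tail of a gradient to near the source.
for mean_count in (10, 100, 10_000):
    print(mean_count, poisson_cv(mean_count))
```

Under this assumption the relative noise is largest where the mean is smallest, while the absolute spread (the square root of the mean) is smallest there.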

      Minor comments:

      The authors should double check all equation and figure references as I noted several instances in which it appeared that the wrong equation or figure was being referred back to. Similarly, the authors should double check the equations themselves, particularly those in the supplemental material.

      We thank the Referee for noticing these mistakes. We have reviewed the equation and figure references and corrected the wrongly linked ones.

      Eqs. SM1.1 and SM1.2 have a plethora of parameters with a wide array of different sub- and superscripts that are left unexplained and possibly incorrectly labelled in some cases,

      Equations SM1.1 and SM1.2 described a general form of Triangular and Trapezoidal dynamics and the different sub- and superscripts come from the published experimental data. Nevertheless, in order to make them more intuitive we have simplified the expressions and included a more detailed description of those parameters and their scripts in the revised version.

      while the second line of Eq. SM2.2 is nonsensical unless r_I*p=0 and p_i<=1.

      We thank the Referee for noticing the ambiguity in this equation, which arose because it was written in the iterative syntax used in the software code. The code therefore never encountered this nonsensical range of data, but we agree that the equation should be written in mathematical syntax like the rest of the equations in the manuscript. We have redefined the notation and better specified the numerical domains of those variables.

      Additionally, the notation used in Figs. 5 and 6 as well as the bottom part of Fig. 7 is confusing. The caption should more explicitly state what the various expressions in the second row of each column represent.

      The second row represents the statistical analysis between cases, coded in a color matrix, as described in the footnote. We thank the Referee for this recommendation because this is not the usual representation. Therefore, we have changed the previous explanation to one that is hopefully clearer and more intuitive; we have also included a specific label in the figures.

      In Fig. 5A specifically it is unclear what exactly the variable phi represents.

      Phi is a widely used notation in biology for cell diameter and cell position. We didn't realize it could be unclear. For better understanding within a multidisciplinary field, we have changed this symbol.

      Does it have anything to do with the phi that is used as a position variable for the cells, and if it is a ratio of cytoneme length to cell diameter then why does it have units of microns?

      We agree that this phi notation is confusing. It has been used to indicate distance position as well as cell diameter. Although these variables are biologically related, in the new version of the manuscript we have changed the notation to separate both concepts and avoid misunderstandings.

      Significance:

      As the Cytomorph model and software can be applied to a wide variety of systems involving morphogen transport via cytonemes, it provides a technical advance in our ability to analyze and discuss the results of measurements on cytonemes in a more homogenous way. This work and the resulting software are particularly applicable to, and build on, studies done by other groups that study the dynamics of cytonemes, such as the Kornberg lab (works from which are cited by the authors) and the Scholpp lab (such as Stanganello E, Scholpp S. Role of cytonemes in Wnt transport. J Cell Sci. 2016; 129(4):665-672), and as such it is experimental labs such as these that will be the most interested in this manuscript and its findings.

      My field of expertise lies primarily in stochastic modeling and linear response theory. As such, I feel I do not have sufficient expertise to evaluate the experimental methods outlined in this manuscript and determine their level of scientific rigor.

      Reviewer #2

      The manuscript "Improving the understanding of cytoneme-mediated morphogen gradients by in silico modelling" addresses the role of in silico modelling in understanding pattern formation via cytonemes: filopodia that transport signalling molecules to and from cells. Investigating the role of cytonemes and, in particular, their dynamics, during development is an important and emerging field in developmental biology, and there is great potential for mathematical modelling to aid in understanding these processes.

      The present manuscript attempts to derive a general set of equations describing pattern formation in the context of cytonemes, akin to that of the classic Turing model of morphogenesis. The authors replace the standard diffusion term in the PDE with a non-local term, intended to represent transport via cytonemes. This model is then posed over a one-dimensional domain with a source at one end and no flux boundary conditions at the other and is shown to be able to generate a morphogen gradient profile that could pre-pattern a biological tissue. The model is tested against a key experimental system, namely, Hh signalling in the Drosophila wing imaginal disc and is shown to reproduce some experimental results. Finally, the authors have developed a Matlab-based software package that they claim will be applicable to a wide range of systems. This GUI-based software allows users to input experimentally measured averages of cytoneme properties and explore the effect of these properties on tissue patterning.

      My primary concern is that the paper presents itself as a mathematical model of cytoneme formation in general. The authors themselves state in their introduction that the mechanisms for cytoneme generation and maintenance are presently unknown. In fact, it is not even known if they are consistent across biological systems (and in fact, are probably not in general). As such, any present instantiation that connects cytoneme dynamics to tissue patterning can only hope to be specific to a particular system (in this case, the Drosophila wing imaginal disc).

      As mentioned in the introduction, the connection of cytonemes with patterning has been described in several works. We have included a list of publications describing the implication of cytoneme-mediated signaling for several morphogens (FGF, EGF, Hh, Dpp, Wnt or Notch) and in many vertebrate and invertebrate systems (Drosophila, chicken, Xenopus, zebrafish, mouse and human tissue culture cells).

      Whilst one may use general models (like the heat equation) to study pattern formation since it requires only specification of parameters, the model here requires specification of families of functions, that are likely to differ from context to context and so the model is not general.

      Our model inputs are parameters determined experimentally rather than families of functions. This misunderstanding might derive from the use of triangular and trapezoidal dynamics, which are equations included in the software code but not input functions. To avoid this confusion, we have specified the input data in tables S.1 and S.2 and clarified in the main text that the triangular or trapezoidal family of functions are just the names for the basic dynamics of cytonemes (triangular for elongation and retraction, and trapezoidal when there is a stationary phase in between).

      Ultimately, the model is a statistical modelling framework masquerading as a mechanistic one.

      In this work, we have not specified the mathematical area to which the model belongs. Furthermore, we always explicitly described the different variables and functions modeled. Therefore, we do not understand what the supposed masquerade is.

      As further evidence of the lack of generality of the model, the studied domain is only one dimensional and has signalling sources at one end. This scenario is perfectly adequate for theoretical explorations of pattern-forming systems but is highly unlikely to capture the geometrical intricacies of real-world systems (and I note that even in the diffusive case, boundary conditions are critical for understanding what patterns ultimately arise for a given system).

      We agree with the Referee that there are cases in biological systems in which it is necessary to work in 2D or even 3D to fully understand the process. Nevertheless, those are mainly related to biological patterns rather than to biological signaling gradients, which are usually studied (experimentally and theoretically) in 1D. Therefore, we have limited our model to this case and compared our in silico results with the published experimental data. In any case, we have emphasized in the text that our model is limited to signaling gradients with the source at one end, which is the case for the best-studied morphogens: Hh (Sonic-hh), Dpp (BMP) or Wg (Wnt).

      Moreover, as proof of the generality of the model, we have predicted different properties of the Dpp and Wg gradients using our model. We then validated the simulated results using experimental data obtained from independent publications.

      To simulate their model, the authors need to specify triangular and trapezoidal functions, which are unlikely to be generalisable to all contexts. As such, the model is not general and, in particular, there is no way to change the software to make it so.

      Cytonemes are filopodial structures based on actin filaments that polymerize and depolymerize to elongate and retract. This is a general process for all filopodial structures, and it is why cytoneme dynamics were classified in a previously published work as a triangular behavior or, if the dynamics include a stationary phase, as a trapezoidal behavior (Gonzalez-Méndez et al., 2017). Therefore, these functions are just a categorization introduced to better describe the intrinsic dynamics of cytonemes, which could be applied to most experimental cases. To address the Referee's concern, we have included in the introduction a more detailed description of these behaviors, as well as references to publications describing the dynamic behaviors of cytonemes for different morphogens and in different organisms.

      Trying to make a generalization for all cases, we included in the model those situations in which the cytonemes were static rather than dynamic (detailed simulations comparing dynamic and static cases can be found in the old Supplementary Figure S.5 A (now S.7 A)).

      We have concluded that the model can be considered generalizable since it includes the simplest and most general cases in terms of cytoneme dynamics.
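
As a reading aid, the triangular and trapezoidal behaviors described in this reply can be sketched as a piecewise function of time. This is a minimal sketch, not the Cytomorph implementation: it assumes constant elongation and retraction velocities, and the function name and all parameter names are hypothetical.

```python
def cytoneme_length(t, v_elong, v_retract, l_max, t_stationary=0.0):
    """Length of a single cytoneme at time t after it starts to grow.

    Assumed phases (illustrative only):
      1. elongation at constant speed v_elong until l_max is reached
      2. optional stationary phase of duration t_stationary at l_max
      3. retraction at constant speed v_retract back to zero length
    t_stationary = 0 gives a triangular profile; t_stationary > 0 gives
    a trapezoidal one.
    """
    if t < 0:
        return 0.0
    t_up = l_max / v_elong  # time needed to reach the maximum length
    if t < t_up:
        return v_elong * t
    if t < t_up + t_stationary:
        return float(l_max)
    t_since_retraction = t - t_up - t_stationary
    return max(l_max - v_retract * t_since_retraction, 0.0)


# Triangular profile: no stationary phase.
triangular = [cytoneme_length(t, 2.0, 2.0, 10.0) for t in range(0, 11)]
# Trapezoidal profile: three time units at maximum length.
trapezoidal = [cytoneme_length(t, 2.0, 2.0, 10.0, t_stationary=3.0)
               for t in range(0, 14)]
print(triangular)
print(trapezoidal)
```

In this sketch a static cytoneme corresponds to the limit of a very long stationary phase, which is one way to see why the static case can be treated alongside the dynamic ones.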

      Whilst the development of a GUI for this scenario is a nice contribution, I feel that the lack of generalisability will, at best, mean that the software enjoys little use, and at worst, may lead researchers unfamiliar with the modelling context to misuse it in error.

      Once we knew the model could be generalized, we were concerned about possible misuse of the mathematical model, which is why we decided to develop a GUI that is as simple as possible.

      Furthermore, the online repository contains, together with the open software, a user guide for Cytomorph with a full description of the parameters, variables and outputs and of how to use them properly.

      In my opinion, this work would be better suited as a presentation of specific mathematical modelling of tissue patterning in the Drosophila wing imaginal disc. In this case, many of the above concerns would be addressed.

      We have rewritten part of the text to indicate the limits of the model and make clear that it has been tested experimentally for the Hh pathway and in two different developing systems: wing imaginal discs and abdominal histoblast nests.

      As evidence of a more general use of Cytomorph, we have added in the revised version of the manuscript a new section focused on data prediction for the gradients of Dpp and Wg. We have also included supplementary figures that validate the predictions of our model using published experimental data.

      That said, there are still a number of issues with the presentation of the model and results. I shall detail these in the bullet point list below:

      1. The domain for Eq. 1 needs to be made explicit. Later, it appears that the domain is a closed one-dimensional interval, but the use of arrows here implies that x is a vector and hence x ∈ D ⊂ ℝⁿ with n > 1.

      We initially wrote the general morphogen equation for x ∈ ℝⁿ and later restricted it to 1D. This is why x, as a vector, initially carried an arrow, although it later became a scalar variable. Since this work concerns only the 1D case, we have now written the equations in 1D from the beginning to avoid this kind of misunderstanding, and we clearly specify the domain used for x: the set of natural numbers, x ∈ ℕ₀.

      2. It is unclear over what the sum in Eq. 2 is being taken.

      The sum in Eq. 2 is over the number of producing cell rows. We have changed the notation to clarify this point.

      3. The statement "we used the discrete cell position x = φ as spatial coordinate" is vague and does not help the reader understand the discretization.

      The number of cell diameters is a widely used discrete unit for position in Developmental Biology. As we expect the readers of this publication to be multidisciplinary, we have changed the notation to avoid misunderstandings and clarify this discretization.

      4. p is used both as a probability and as an index for producer cells. This is confusing.

      We have changed the notation to avoid misunderstandings.

      5. As previously stated, the choice of trapezoidal/triangular cytoneme dynamics is not general. More work needs to be done to showcase how the authors came to the conclusion that this is the best choice, and how the functions (and their associated parameters) describing them were selected.

      The names “triangular” and “trapezoidal” refer to the published dynamics of cytoneme elongation and retraction, and we have argued above for their generality. As we specified in the manuscript, these types of behavior have been observed experimentally, and we considered this observation reason enough to include them in the model. For more detail, the Materials and Methods section and Supplementary Table S.3 show that the times measured for triangular and trapezoidal dynamics are statistically different and, consequently, that both behaviors have to be considered.

      As mentioned in the manuscript, the associated parameters represent the times and velocities of elongation and retraction, which have already been thoroughly analyzed and published (González-Méndez et al., 2017). The Referee’s question about how these functions affect the gradient is answered in the text and in Figure 7 F.
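      As an illustration of the two behaviors discussed in this response, the following minimal sketch assumes piecewise-linear elongation and retraction. The function names, arguments and numerical values are hypothetical and are not taken from Cytomorph or from the published measurements; the triangular case is treated here simply as a trapezoid with no stationary phase.

```python
def trapezoidal_length(t, t_elong, t_static, t_retract, max_length):
    """Cytoneme length at time t: linear elongation, stationary phase, linear retraction.

    All arguments are hypothetical illustration parameters (same time units for
    t, t_elong, t_static, t_retract; max_length in any length unit).
    """
    total = t_elong + t_static + t_retract
    if t < 0 or t > total:
        return 0.0  # no cytoneme outside its lifetime
    if t <= t_elong:
        return max_length * t / t_elong  # elongation phase
    if t <= t_elong + t_static:
        return float(max_length)  # stationary phase at full length
    # retraction phase: length decreases linearly back to zero
    return max_length * (1.0 - (t - t_elong - t_static) / t_retract)


def triangular_length(t, t_elong, t_retract, max_length):
    """Triangular behavior: a trapezoid whose stationary phase has zero duration."""
    return trapezoidal_length(t, t_elong, 0.0, t_retract, max_length)
```

      Writing the triangular case as a special case of the trapezoidal one keeps a single code path; whether the two behaviors should share parameters in practice is a modeling decision that this sketch does not address.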

      6. I can see how Type 1 and Type 2 cytonemes could be expanded naturally to a higher dimensional case, but it is not clear how Type 3 cytonemes could be, since the probability of any two cytonemes occupying the same space in higher dimensions is likely to be small (if they are imbued with independent dynamics).

      We agree with the Referee on this point. It should be considered in future extensions of the model to higher dimensions. For instance, a complex 2D scenario will require a cytoneme guidance model. Nevertheless, since the present study is limited to 1D, this concern does not apply to the current model.

      7. The statement: "the distance between cells must be smaller than, or equal to, the maximum length of the cytonemes" seems inconsistent with the equations below since λ(t) does not appear to be a maximum length.

      The length of the cytonemes is a dynamic function described by λ(t). Our statement referred to the maximum length at each time step, which is given by λ(t). We agree that the initial statement could lead to misunderstanding, so we have removed the word “maximum”.
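      The condition discussed here can be sketched for the simplest case of a single cytoneme extending from one cell toward another along the 1D row of cells. The function name, the use of cell-row indices as positions and the conversion through a cell diameter are assumptions made only for illustration; they are not the Cytomorph implementation.

```python
def can_contact(producer_pos, receiver_pos, cytoneme_length, cell_diameter):
    """Return True if a cytoneme of the current length spans the cell-to-cell distance.

    producer_pos, receiver_pos: integer cell-row indices (hypothetical convention).
    cytoneme_length: current length, i.e. the value of lambda(t) at this time step.
    cell_diameter: converts a distance in cell rows to the unit of cytoneme_length.
    """
    distance = abs(receiver_pos - producer_pos) * cell_diameter
    return distance <= cytoneme_length
```

      For example, with a hypothetical cell diameter of 3 length units, a cytoneme of current length 10 reaches a cell three rows away (distance 9) but not one four rows away (distance 12).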

      8. I think the authors are confusing probabilities and rates in their discussion of the model. Eq. 1 is a density model and so calling events probabilities here is slightly misleading. As a more general statement, I am currently interpreting contact function C as one defined as a rate, rather than as a set of probabilistic terms. If the latter is true, then Eq. 1 is invalid since it mixes processes at different levels of description.

      We thank the Referee for this comment. We have studied this observation in depth, but we could not determine exactly why the Referee considers that the model mixes different levels of description. Although we could not find where in the text we referred to the events of Eq. 1 as “probabilities”, we have rewritten the text to make clear what we consider a probability and what we consider a rate. In addition, in the Supplementary Material section we clarify how the model works and at which levels of modeling we are working.

      Significance

      In general, the paper is well written, however, the focus of the findings should be on patterning within an epithelium such as the Drosophila wing imaginal disk.

      The work will be interesting for the developmental biology community as well as for the upcoming biomathematical modelling community.

      Expertise: Developmental biologist with experience in tissue patterning and morphogen gradients

      Referees cross-commenting

      I agree with Reviewer 3 that the importance of cytoneme-mediated signalling has been described in several systems - invertebrates and vertebrates. However, I think the focus of this work in particular should be on cytoneme signalling in the wing imaginal disc. IMO, this would not limit the conclusion but rather focus it and make it then applicable to epithelial tissues in general. I agree with the other point.

      Reviewer #3

      There is much to like in this thoughtful and worthwhile study that develops mathematics to describe how cytonemes might generate experimentally observed Hh gradients. Two suggestions:

      1. I am not equipped to evaluate the mathematics and as a non-expert would find it helpful if the authors explicitly stated at the outset what assumptions they took, the specific contexts they sought to model, and the parameters that they explored.

      We agree with the Referee on the excessively mathematical focus of our interpretation of the results in the old version of the manuscript. We have rewritten part of the text to clarify the biological implications of the variables and simulations explored.

      Am I correct that they assume that the Hh gradient correlates with a cytoneme gradient, that all cytoneme contacts have the same duration and exchange equivalent amounts of Hh, and that the variables that were characterized are cytoneme length distributions, cytoneme extension rate, contact duration, and cytoneme density?

      Since the mechanism of morphogen exchange is not fully identified, we assumed the simplest case in which all the contacts have the same duration and exchange the same amount of morphogen. Using this approach, we were able to reproduce the gradient and concluded that it is not strictly necessary to propose a more complex mechanism to establish a graded distribution of morphogens. We therefore worked under this assumption.

      The variables characterized were the ones pointed out by the Referee: mainly cytoneme features, such as the cytoneme length distributions and the different parameters of the temporal dynamics. We have tried to define these variables better in the new version of the manuscript.

      2. One of the unusual features of the Hh gradient in the wing disc is that the size of the posterior compartment field of Hh-producing cells is large relative to the size and extent of the Hh gradient in the adjacent anterior compartment. Wing discs with large hh mutant clones, wing discs with large smo mutant clones, and wing discs with ttv mutant clones that block Hh uptake provide evidence that the Hh gradient is constituted with Hh that is produced by many cells, some that are far from the compartment border as well as some that are close. Has this been factored into the author's model?

      Indeed. Being aware of the importance of the size of the signal source, we simulated how changing the size of the posterior compartment affects the gradient (altering the number of producing cell rows involved, figure 5B). In the old version of the manuscript we had focused on the theoretical approach, so we thank the reviewer for noticing that we should introduce a more biological point of view. Therefore, we included in the revised version of the manuscript a biological interpretation of how our simulations can help to understand the question posed by the reviewer.

      Does the fact that the relative size of the posterior compartments and Hh gradients in the histoblasts is not as extreme as it is in the wing disc influence their model?

      Following the Referee’s question, we decided to simulate the influence of the relative size of the posterior compartment in the abdominal histoblast nests. We found that in both wing discs and histoblasts the size of the posterior compartment affects the gradient, but by a different scale factor. We have included these data in the revised version of the manuscript (new Supplementary Figure S.5).

      Interestingly, this feature of the Hh gradient in the wing disc is not shared with other gradients in the wing disc such as the Wg, Dpp, and Bnl gradients. I would be interested to know if the author's model can be queried to suggest what properties might contribute to this difference?

      To answer the reviewer’s question, we have used our model to tentatively simulate the Wg and Dpp gradients. Our preliminary results suggest that, considering only cell position and cytoneme length, the lengths of the Wg and Dpp gradients in the wing imaginal disc can be predicted. Nevertheless, each morphogen has its own particularities, and further studies are required for a precise simulation of these gradients. We have included these results in a new section of the manuscript and in the new Supplementary Figure S.9.

      Significance

      This is an important contribution to gaining a basic understanding of the role of various properties of dynamic cytonemes to gradient formation.

      Referees cross-commenting

      I discount the apparently strongly held opinion of Reviewer #2 that "it is not even known if they [cytonemes] are consistent across biological systems (and in fact, are probably not in general)". I do not know where this comes from and do not think that such opinions are appropriate for anonymous reviews.

      Cytoneme-mediated signaling has in fact been observed and characterized in many diverse biological systems. I submit that in contrast, mechanisms of dispersion based on diffusion are inferred and lack direct experimental evidence. I do agree that it is fair to ask the authors to carefully describe their work in the context of epithelial signaling, but it is not correct to ask them to limit their conclusions to the wing disc as the authors analyze both wing disc and histoblast signaling. They clearly state that their work is limited to 1D and so we understand that it is inadequate to model 3D morphologies. I do not criticize them for this.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      There is much to like in this thoughtful and worthwhile study that develops mathematics to describe how cytonemes might generate experimentally observed Hh gradients. Two suggestions:

      1. I am not equipped to evaluate the mathematics and as a non-expert would find it helpful if the authors explicitly stated at the outset what assumptions they took, the specific contexts they sought to model, and the parameters that they explored. Am I correct that they assume that the Hh gradient correlates with a cytoneme gradient, that all cytoneme contacts have the same duration and exchange equivalent amounts of Hh, and that the variables that were characterized are cytoneme length distributions, cytoneme extension rate, contact duration, and cytoneme density?
      2. One of the unusual features of the Hh gradient in the wing disc is that the size of the posterior compartment field of Hh-producing cells is large relative to the size and extent of the Hh gradient in the adjacent anterior compartment. Wing discs with large hh mutant clones, wing discs with large smo mutant clones, and wing discs with ttv mutant clones that block Hh uptake provide evidence that the Hh gradient is constituted with Hh that is produced by many cells, some that are far from the compartment border as well as some that are close. Has this been factored into the author's model? Does the fact that the relative size of the posterior compartments and Hh gradients in the histoblasts is not as extreme as it is in the wing disc influence their model? Interestingly, this feature of the Hh gradient in the wing disc is not shared with other gradients in the wing disc such as the Wg, Dpp, and Bnl gradients. I would be interested to know if the author's model can be queried to suggest what properties might contribute to this difference?

      Significance

      This is an important contribution to gaining a basic understanding of the role of various properties of dynamic cytonemes to gradient formation.

      Referees cross-commenting

      I discount the apparently strongly held opinion of Reviewer #2 that "it is not even known if they [cytonemes] are consistent across biological systems (and in fact, are probably not in general)". I do not know where this comes from and do not think that such opinions are appropriate for anonymous reviews.

      Cytoneme-mediated signaling has in fact been observed and characterized in many diverse biological systems. I submit that in contrast, mechanisms of dispersion based on diffusion are inferred and lack direct experimental evidence. I do agree that it is fair to ask the authors to carefully describe their work in the context of epithelial signaling, but it is not correct to ask them to limit their conclusions to the wing disc as the authors analyze both wing disc and histoblast signaling. They clearly state that their work is limited to 1D and so we understand that it is inadequate to model 3D morphologies. I do not criticize them for this.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript "Improving the understanding of cytoneme-mediated morphogen gradients by in silico modelling" addresses the role of in silico modelling in understanding pattern formation via cytonemes: filopodia that transport signalling molecules to and from cells. Investigating the role of cytonemes and, in particular, their dynamics, during development is an important and emerging field in developmental biology, and there is great potential for mathematical modelling to aid in understanding these processes.

      The present manuscript attempts to derive a general set of equations describing pattern formation in the context of cytonemes, akin to that of the classic Turing model of morphogenesis. The authors replace the standard diffusion term in the PDE with a non-local term, intended to represent transport via cytonemes. This model is then posed over a one-dimensional domain with a source at one end and no flux boundary conditions at the other and is shown to be able to generate a morphogen gradient profile that could pre-pattern a biological tissue. The model is tested against a key experimental system, namely, Hh signalling in the Drosophila wing imaginal disc and is shown to reproduce some experimental results. Finally, the authors have developed a Matlab-based software package that they claim will be applicable to a wide range of systems. This GUI-based software allows users to input experimentally measured averages of cytoneme properties and explore the effect of these properties on tissue patterning.

      My primary concern is that the paper presents itself as a mathematical model of cytoneme formation in general. The authors themselves state in their introduction that the mechanisms for cytoneme generation and maintenance are presently unknown. In fact, it is not even known if they are consistent across biological systems (and in fact, are probably not in general). As such, any present instantiation that connects cytoneme dynamics to tissue patterning can only hope to be specific to a particular system (in this case, the Drosophila wing imaginal disc). Whilst one may use general models (like the heat equation) to study pattern formation since it requires only specification of parameters, the model here requires specification of families of functions, that are likely to differ from context to context and so the model is not general. Ultimately, the model is a statistical modelling framework masquerading as a mechanistic one.

      As further evidence of the lack of generality of the model, the studied domain is only one dimensional and has signalling sources at one end. This scenario is perfectly adequate for theoretical explorations of pattern-forming systems but is highly unlikely to capture the geometrical intricacies of real-world systems (and I note that even in the diffusive case, boundary conditions are critical for understanding what patterns ultimately arise for a given system). To simulate their model, the authors need to specify triangular and trapezoidal functions, which are unlikely to be generalisable to all contexts. As such, the model is not general and, in particular, there is no way to change the software to make it so. Whilst the development of a GUI for this scenario is a nice contribution, I feel that the lack of generalisability will, at best, mean that the software enjoys little use, and at worst, may lead researchers unfamiliar with the modelling context to misuse it in error.

      In my opinion, this work would be better suited as a presentation of specific mathematical modelling of tissue patterning in the Drosophila wing imaginal disc. In this case, many of the above concerns would be addressed. That said, there are still a number of issues with the presentation of the model and results. I shall detail these in the bullet point list below:

      1. The domain for Eq. 1 needs to be made explicit. Later, it appears that the domain is a closed one-dimensional interval, but the use of arrows here implies that x is a vector and hence x ∈ D ⊂ ℝⁿ with n > 1.
      2. It is unclear over what the sum in Eq. 2 is being taken.
      3. The statement "we used the discrete cell position x = φ as spatial coordinate" is vague and does not help the reader understand the discretization.
      4. p is used both as a probability and as an index for producer cells. This is confusing.
      5. As previously stated, the choice of trapezoidal/triangular cytoneme dynamics is not general. More work needs to be done to showcase how the authors came to the conclusion that this is the best choice, and how the functions (and their associated parameters) describing them were selected.
      6. I can see how Type 1 and Type 2 cytonemes could be expanded naturally to a higher dimensional case, but it is not clear how Type 3 cytonemes could be, since the probability of any two cytonemes occupying the same space in higher dimensions is likely to be small (if they are imbued with independent dynamics).
      7. The statement: "the distance between cells must be smaller than, or equal to, the maximum length of the cytonemes" seems inconsistent with the equations below since λ(t) does not appear to be a maximum length.
      8. I think the authors are confusing probabilities and rates in their discussion of the model. Eq. 1 is a density model and so calling events probabilities here is slightly misleading. As a more general statement, I am currently interpreting contact function C as one defined as a rate, rather than as a set of probabilistic terms. If the latter is true, then Eq. 1 is invalid since it mixes processes at different levels of description.

      Significance

      In general, the paper is well written, however, the focus of the findings should be on patterning within an epithelium such as the Drosophila wing imaginal disk.

      The work will be interesting for the developmental biology community as well as for the upcoming biomathematical modelling community.

      Expertise: Developmental biologist with experience in tissue patterning and morphogen gradients

      Referees cross-commenting

      I agree with Reviewer 3 that the importance of cytoneme-mediated signalling has been described in several systems - invertebrates and vertebrates. However, I think the focus of this work in particular should be on cytoneme signalling in the wing imaginal disc. IMO, this would not limit the conclusion but rather focus it and make it then applicable to epithelial tissues in general. I agree with the other point.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this work the authors present a simple mathematical model for the distribution of morphogen molecules that travel via cytonemes through a 1-dimensional system. This model is used as a basis for a software package called Cytomorph that takes as an input a set of experimentally measured distributions of cytoneme dynamics as well as experimenter determined parameters such as contact probability and method of cytoneme growth and retraction. The Cytomorph package then outputs spatial and temporal information on the distribution of morphogen as well as cytonemes and their contacts with cells and other cytonemes, all obtained over thousands of simulation runs. A number of in silico experiments are then performed to show that these outputs agree with experimentally measured morphogen distributions of Hedgehog in the imaginal wing disc and abdominal histoblast nest. Further in silico experimentation is done to study how this distribution is affected by a wide array of parameters such as producer row number, cytoneme connection method, and connection probability function. Comparisons to the traditional diffusion based model are also made. The authors find a suite of results based on these experiments and accordingly present the Cytomorph software package as a useful and adaptable tool for the community.

      Major comments:

      While the various in silico experiments present an expansive and exhaustive study of the different ways in which Cytomorph can be used to examine a cytoneme based distribution system, the machinery behind the software is left notably underdescribed. The authors do not sufficiently make clear what exactly happens within each iteration of the simulations run by Cytomorph, leaving the results irreproducible without the reader going into and deciphering the software code itself. Some of the specific details left undiscussed are how it is determined when and where a cytoneme will spawn or what its maximum length will be, the dynamics of morphogen transport within the cytonemes, the effects of one cytoneme making multiple connections on how much morphogen is delivered through each connection, and where exactly stochasticity is introduced so as to allow for variations between simulation runs; amongst others. Additionally, when the authors investigate the diffusion model their stated boundary conditions do not match those presented at the end of the Materials and Methods section. The initial condition u(x,0)=0 and boundary condition du(L,t)/dt=0 represent a perfectly absorbing molecule sink at the x=L end of the system, not the reflecting boundary condition du(L,t)/dx=0 that would correspond to a zero morphogen flux. Finally, while the authors spend a great deal of effort analyzing signal variability between simulation runs, there is no effort made to account for the inherently stochastic nature of molecular production, movement, and degradation. Particularly if molecule numbers are small, fluctuations in these processes could greatly increase signal variability. The authors should either address why these fluctuations are negligible or include them in the modelling.
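      The distinction between the two boundary conditions raised in this comment can be made explicit with a short derivation. It assumes a standard Fickian flux with a diffusion coefficient D, a symbol introduced here only for illustration.

```latex
\begin{align}
u(L,0) = 0,\quad \frac{\partial u}{\partial t}(L,t) = 0
  \;&\Longrightarrow\;
  u(L,t) = u(L,0) + \int_0^t \frac{\partial u}{\partial t}(L,s)\,ds = 0
  && \text{(absorbing, Dirichlet)} \\
J(L,t) = -D\,\frac{\partial u}{\partial x}(L,t) = 0
  \;&\Longleftrightarrow\;
  \frac{\partial u}{\partial x}(L,t) = 0
  && \text{(reflecting, zero flux)}
\end{align}
```

      In the first case the concentration at x = L is pinned at zero for all times, so any molecule reaching the boundary is removed; in the second case no molecules cross the boundary and the concentration there is free to rise.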

      Minor comments:

      The authors should double check all equation and figure references as I noted several instances in which it appeared that the wrong equation or figure was being referred back to. Similarly, the authors should double check the equations themselves, particularly those in the supplemental material. Eqs. SM1.1 and SM1.2 have a plethora of parameters with a wide array of different sub- and superscripts that are left unexplained and possibly incorrectly labelled in some cases, while the second line of Eq. SM2.2 is nonsensical unless r_I*p=0 and p_i<=1. Additionally, the notation used in Figs. 5 and 6 as well as the bottom part of Fig. 7 is confusing. The caption should more explicitly state what the various expressions in the second row of each column represent. In Fig. 5A specifically it is unclear what exactly the variable phi represents. Does it have anything to do with the phi that is used as a position variable for the cells, and if it is a ratio of cytoneme length to cell diameter then why does it have units of microns?

      Significance

      As the Cytomorph model and software can be applied to a wide variety of systems involving morphogen transport via cytonemes, it provides a technical advance in our ability to analyze and discuss the results of measurements on cytonemes in a more homogeneous way. This work and the resulting software are particularly applicable to, and build on, studies done by other groups that study the dynamics of cytonemes such as the Kornberg lab (works from which are cited by the authors) and the Scholpp lab (such as Stanganello E, Scholpp S. Role of cytonemes in Wnt transport. J Cell Sci. 2016; 129(4):665-672), and as such it is experimental labs such as these that will be the most interested in this manuscript and its findings.

      My field of expertise lies primarily in stochastic modeling and linear response theory. As such, I feel I do not have sufficient expertise to evaluate the experimental methods outlined in this manuscript and determine their level of scientific rigor.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewers "Cell-cell communication through FGF4 generates and maintains robust proportions of differentiated cell types in embryonic stem cells"

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript Raina et al. use an in vitro model of PE specification based on the transient overexpression of GATA4 in ESCs to show that the acquisition of primitive endoderm (PE) identity is governed at the population level by cell-cell interactions mediated by FGF signaling. The authors further argue that the specification of a defined proportion of "PE" and "Epiblast" cells in a differentiating population of ESC is an emergent property of a system where paracrine signaling shifts the balance between two alternative stable states. Overall, the work does not reach radically new conclusions: broadly similar models are outlined in several other publications, including from the authors. Yet this study makes use of elegant genetic models and is particularly well executed. In addition, it includes a very accurate characterisation of the spatial range of FGF signaling activity that is original and adds to the existing knowledge. Moreover, the authors show novel evidence suggesting that GATA factors inhibit Fgf4 transcription and the activity of the FGF signaling pathway in ESCs.

      We thank the Reviewer for commending the execution of the experiments, and for highlighting the novel insights that they bring. The Reviewer acknowledges that the specification of a defined proportion of PrE-like and Epiblast-like cells in a differentiating population of ESCs is an emergent property which is mediated by paracrine FGF4 signaling. This has not been experimentally demonstrated before. In contrast to the Reviewer’s assertion, we therefore think that our work does reach a conclusion that is radically different from previous experimental studies, a view that is also shared by Reviewer #3 below. In a revised version of the manuscript we will further emphasize the conceptual differences between published models that focus on single cell dynamics, and our experimental and theoretical demonstration of qualitatively different dynamics that emerge at the population level as a consequence of cell fate coupling.

      **Two major points deserve further clarification:**

      In this manuscript the authors claim that the proportion of cells acquiring PE fate is, at least in the experimental setup adopted, largely independent from the levels of GATA4 induction, and therefore of the initial state of the gene regulatory network regulating this cell fate transition. However, the authors should discuss how the current findings relate to their previous results, showing that the duration/levels of Gata4 induction, in a similar experimental setting, play an important role in determining the final proportion of cells acquiring "PE" fate. Absolute expression levels may be crucial for this distinction, but the authors seem to exclude this possibility (see figure S3).

      The different roles of GATA4-mCherry induction levels in determining the final proportion of cells acquiring a PrE-like fate, as reported in our previous work (PMID: 26511924) and in the current work, result from important differences in the experimental settings between the two studies. In PMID: 26511924, we assayed PrE-like differentiation in medium supplemented with serum and LIF, which provides exogenous signals that promote PrE-like differentiation. These conditions reveal the function of the cell-autonomous circuit, in which GATA4-mCherry levels do control the probability of PrE-like differentiation. In the current work, we likewise observe that cell type proportions depend on GATA4-mCherry induction levels when we supply exogenous FGF4 during the differentiation of wild type cells (Figures S2C and S3D, lower panel). Differentiation in the absence of exogenous factors, in contrast, reveals the behavior of the coupled system, in which cell type proportions are independent of GATA4-mCherry induction levels.

      Furthermore, in the present manuscript, we use new inducible cell lines in which the majority of cells can be induced above the critical GATA4-mCherry threshold required for PrE-like differentiation, in contrast to our previous study where the distribution of GATA4-mCherry induction levels was straddling this threshold.

      In a revised version of the manuscript, we will more explicitly emphasize these important differences in the experimental design between the two studies, and discuss how the specific conditions in the present study lead to new conclusions.

      Most importantly, the authors incorporate in their model the notion that GATA6 inhibits FGF signaling. It would be interesting to understand how such inhibition is mechanistically mediated. For instance GATA6 has been shown to bind in proximity of the Fgfr2 gene (Wamaitha et al., Genes and Dev., 2015). Alternatively, the authors show a direct effect on Fgf4 expression. The short time window of the reported repressive transcriptional effects (8h, Fig 2 middle), might suggest a direct regulation. The authors should test this possibility, and discuss what alternative modes of regulation could be envisaged (for instance, indirect effects mediated by Nanog). This is a key result that deserves a more detailed mechanistic characterisation.

      The regulation of FGF signaling by GATA factors has been pointed out as a central new result of our study by all three reviewers that we will be happy to further expand on in a revised manuscript. Regulation of Fgfr2 expression by GATA6 as suggested by the ChIP-seq data in Wamaitha et al., 2015 (PMID: 26109048) is one possible mechanistic explanation that we will of course discuss.

      Most importantly, we will test possible direct effects of GATA factors on Fgf4 expression that are indicated by the short timescales of the transcriptional effects shown in Fig. 2, as noted by the Reviewer. We have already mined the ChIP-seq data from Wamaitha et al., 2015 (PMID: 26109048) and found a GATA6-binding peak approximately 10 kb upstream of the Fgf4 start codon in a region that is highly enriched for GATA6 consensus binding sites. To test the functional role of this binding region, we propose to delete it by CRISPR-mediated mutagenesis in the inducible lines, and to test its ability to regulate reporter gene expression in heterologous assays.

      To address the question of alternative modes of regulation of Fgf signaling through NANOG, we have already performed in situ mRNA stainings for Fgf4 expression in cells grown for 40 h in N2B27 medium. While Nanog expression is much reduced under these conditions, Fgf4 mRNA continues to be expressed, indicating that positive regulation through NANOG is not essential for Fgf4 mRNA expression in ESCs. We will add this data to a revised manuscript, and discuss its implications for the regulation of Fgf4 transcription (see also our response to Reviewer #3 below). As a complementary approach to further test the role of indirect effects mediated through NANOG, we will dissect more closely the timing of Fgf4 downregulation reported in Fig. 2B relative to the upregulation of the inducible GATA4-mCherry protein and the downregulation of NANOG protein.

      **Minor points:**

      Fig S1: The authors should show quantifications of Nanog and GATA6 levels before the beginning of the differentiation protocol.

      We will be happy to add this data in a revised version, as part of a more extensive analysis of GATA4-mCherry and GATA6 expression at early stages of the differentiation protocol. See also our response to the next point.

      Line 106: The authors write "the initially large proportion of GATA6+; NANOG+ double positive cells". It appears that at 16h of differentiation ESCs have already partitioned between Gata6 or Nanog expressing cells. The authors should rephrase the sentence to reflect what seems to be an almost total absence of truly double positive cells. Possibly, an analysis conducted at earlier time points could clarify these dynamics.

      The Reviewer rightly points out that at 16 h of differentiation, most cells are already associated with one of two clusters in the NANOG/GATA6 expression space. The misleading classification of a large number of cells as double positive at 16 h was caused by applying a single gating strategy to the entire experiment, even though the mean expression levels of NANOG and GATA6 in the two clusters change significantly over time. We will update our gating strategy and rephrase this section to more appropriately describe cell clustering and gene expression dynamics over the time course. We will also extend Figure S1 with analysis of GATA6 and NANOG expression levels at earlier time points of the differentiation protocol, to test whether this allows detecting a truly double positive population.
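      The difference between a single gate for the whole experiment and gates set separately for each time point can be illustrated with a minimal sketch. This is not the analysis code used for the manuscript: the function names, the gate values and the example intensities are hypothetical and chosen only to show how a time-point-specific gate changes the classification of cells whose mean NANOG level has drifted.

```python
from collections import Counter


def classify_cell(nanog, gata6, nanog_gate, gata6_gate):
    """Assign one cell to a quadrant of NANOG/GATA6 expression space."""
    nanog_pos = nanog >= nanog_gate
    gata6_pos = gata6 >= gata6_gate
    if nanog_pos and gata6_pos:
        return "double positive"
    if gata6_pos:
        return "GATA6+; NANOG-"
    if nanog_pos:
        return "GATA6-; NANOG+"
    return "double negative"


def quadrant_proportions(cells, gates_by_time):
    """Return the fraction of cells per quadrant for each time point.

    cells: iterable of (time_h, nanog, gata6) tuples (hypothetical intensities).
    gates_by_time: {time_h: (nanog_gate, gata6_gate)}, one gate pair per time point.
    """
    counts = {}
    for time_h, nanog, gata6 in cells:
        nanog_gate, gata6_gate = gates_by_time[time_h]
        label = classify_cell(nanog, gata6, nanog_gate, gata6_gate)
        counts.setdefault(time_h, Counter())[label] += 1
    return {
        time_h: {label: n / sum(c.values()) for label, n in c.items()}
        for time_h, c in counts.items()
    }


# Made-up example: the NANOG gate is lowered at 40 h to follow the cluster mean.
example_cells = [(16, 5.0, 1.0), (16, 1.0, 5.0), (40, 2.5, 1.0), (40, 1.0, 1.0)]
example_gates = {16: (3.0, 3.0), 40: (2.0, 3.0)}
proportions = quadrant_proportions(example_cells, example_gates)
```

      In this toy example, the cell with a NANOG value of 2.5 at 40 h is scored as NANOG-positive with the 40 h gate, whereas the 16 h gate would have placed it in the double negative quadrant.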

      Line 124: The authors write "... concentration dependent downregulation of NANOG expression". The effects may rather depend on the time of doxycycline stimulation.

      We agree with the Reviewer that in isolation, the data shown in Fig. 1 and Fig. S2 leave open the possibility that the stronger downregulation of NANOG at higher GATA4-mCherry expression levels is caused by the extended time of doxycycline stimulation rather than GATA4-mCherry concentration. However, in our opinion, this concern is already addressed by the experiments performed in the four clonal lines with independent integrations shown in Figure S3. Here, the time of doxycycline induction is held constant, and a similar relationship between GATA4-mCherry and NANOG expression levels is observed as in the experiments where we modulate induction time in a single clonal line (compare Fig. S2A to Fig. S3B). In a revised version of the manuscript we will describe more clearly how the experiments shown in Figure S3 control for time-dependent effects of doxycycline stimulation.

      Line 192: The authors write "...and confined to cells with low GATA4-mCherry expression levels". It would be helpful to have an indication of the cell boundaries, possibly showing localisation of a membrane bound protein.

      We agree that more firmly establishing a correlation between GATA4-mCherry expression levels and Fgf4 mRNA expression in single cells would greatly benefit from co-staining with a plasma membrane marker. However, the protocol for mRNA in situ hybridization involves incubation steps with ethanol and formamide and is thus incompatible with staining for commonly used membrane markers. There is one commercially available membrane stain (CellBrite by Biotium) that promises to survive the treatments necessary for in situ hybridization and that we will try to use in our stainings. Should this not be successful, we will resort to identifying a subset of the cytoplasm corresponding to each nucleus by dilating nuclear masks that we will segment based on the DNA stain.
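      The fallback approach of dilating nuclear masks can be sketched as follows. This is a simplified illustration in plain Python and not the segmentation pipeline itself: it assumes a 2D label image in which 0 is background and each nucleus carries a unique integer, grows every label by one pixel per iteration, and leaves pixels that are contested between two nuclei unassigned. The function names and the number of iterations are hypothetical choices.

```python
def dilate_labels(labels, iterations=1):
    """Grow each labelled nucleus by one pixel per iteration (4-connectivity).

    labels: 2D list of ints, 0 = background. Existing labels are never
    overwritten, and background pixels touching more than one label stay 0.
    """
    rows, cols = len(labels), len(labels[0])
    current = [row[:] for row in labels]
    for _ in range(iterations):
        grown = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                if current[r][c] != 0:
                    continue
                neighbours = set()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and current[rr][cc] != 0:
                        neighbours.add(current[rr][cc])
                if len(neighbours) == 1:
                    grown[r][c] = neighbours.pop()
        current = grown
    return current


def cytoplasm_ring(labels, iterations=1):
    """Return only the pixels added by dilation, i.e. a perinuclear ring per cell."""
    dilated = dilate_labels(labels, iterations)
    return [
        [d if n == 0 else 0 for d, n in zip(dilated_row, nuclear_row)]
        for dilated_row, nuclear_row in zip(dilated, labels)
    ]


# Two nuclei (labels 1 and 2) on a single image row.
ring = cytoplasm_ring([[1, 0, 0, 0, 2]], iterations=2)
```

      The mRNA signal measured inside each ring could then be assigned to the nucleus with the same label. The ring only approximates the cytoplasm, so closely packed cells will share some signal.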

      It would be interesting for the authors to discuss how the spatial range of FGF activity measured in culture could affect PE specification in the embryo.

      During lineage specification in the embryo, Epi and PrE cells are initially arranged in a salt-and-pepper pattern (PMID: 16678776; PMID: 18725515; PMID: 30514631). In Fig. 4 and Fig. S9 of our manuscript, we show experimentally and theoretically how similar patterns in ESC colonies arise from the short range of FGF activity. In a revised version of the manuscript, we will discuss how the spatial range of FGF activity measured in culture provides a possible mechanistic explanation for the spatial arrangement of cell types in the embryo.

      Reviewer #1 (Significance (Required)):

      See above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript entitled "Cell-cell communication through FGF4 generates and maintains robust proportions of differentiated cell types in embryonic stem cells" Raina et al study the effect of Fgf-signalling based local cell-cell communication for the establishment of PrE-like and Epi-like cells. The authors use an elegant, albeit artificial, system to analyse the effect of Fgf signalling on establishing 'normal' lineage proportions after transient induction of Gata4 expression. The main conclusions of the manuscript are: i) Gata6 positive cells emerge through short range Fgf4 based cell-cell communication. ii) Fgf4 signalling can compensate for a wide range of initial levels of Gata6 expression and produce properly proportioned cell identities. The authors also state that this mechanism could operate in a range of developing tissues.

      **Major points:**

      1. Fgf4 KO ESCs are deficient in initiating epiblast lineage differentiation (Kunath 2007). Therefore, the effect studied by the authors might be multifactorial and the general inability of Fgf4 deficient cells to enter differentiation might contribute to the observed differentiation defects and defects of cell fate proportioning. Specifically, it could be expected that Nanog regulation is affected in Fgf4 mutants, although, to my knowledge, the specific phenotype of Fgf4 depletion has not been evaluated in Gata4 induced cell programming towards PrE. What steps have the authors taken to exclude an impact of general cell fate change defects in Fgf4 KO ESCs?

      While it is true that Fgf4 mutant cells have a general deficiency in initiating epiblast lineage differentiation, it was already shown in the original publication by Kunath et al. (PMID: 17660198) that general differentiation of Fgf4 mutant cells is restored to wild type levels by supplementing the culture medium with 5 ng/ml recombinant FGF4. This is a concentration that is well within the range of concentrations applied in our study. In initial experiments to characterize our Fgf4 mutant lines, we have measured NANOG expression to test the effectiveness of recombinant FGF4 to restore epiblast lineage differentiation. We found that FGF4 treatment of Fgf4 mutant cells in the absence of doxycycline induction leads to a downregulation of NANOG expression, to levels comparable to those seen in wild type cells grown in N2B27. These data indicate that treatment with recombinant FGF4 rescues defects of general cell fate change in Fgf4 KO ESCs. We will add these data to Figure S4 of a revised manuscript, and explicitly mention the function of recombinant FGF4 to rescue lineage differentiation potential more generally.

      Increasing the time of Gata4 expression results in increasing Gata4 levels (Fig 1C). This is shown at the overall mean fluorescence level. However, it is important to also quantify how many cells do actually show some increase in Gata4 levels. Fig1D suggests that the number of Gata4 expressing cells is quite similar between 4h and 8h induction, but this needs to be quantified. An explanation for the apparent dosage independence of Gata4 could then be simple threshold effects, such that there is no additional effect of increased Gata4 levels in WT cells without any further requirement of feedback regulation after a certain threshold level of Gata4 is reached. Have the authors considered such a simple model?

      The current version of the manuscript already contains quantifications of GATA4-mCherry expression levels in single cells - see Fig. S2A for the experiments where we vary doxycycline induction time, and Fig. S3B for experiments with independent clonal lines. This analysis confirms the Reviewer’s visual impression of Fig. 1D - the number of GATA4-mCherry expressing cells is similar for different induction times and clonal lines, such that the increase in overall mean fluorescence levels is mainly due to an increase in GATA4-mCherry expression levels in single cells. This analysis therefore rules out the simple model based on threshold effects proposed by the Reviewer. In a revised version of the manuscript, we will more explicitly discuss the quantifications in Fig. S2A and Fig. S3B.

      An important point is that in the current setup dosage effects and effects of extended presence of Gata4 cannot be distinguished. Wouldn't titrating the amount of doxycycline used for induction be a more direct way to achieve different initial levels of Gata4 expression?

      This concern has also been raised by Reviewer #1, and is addressed in detail in our response to their comment above. Briefly, in our opinion this concern is addressed in the current manuscript by the experiments performed in the four clonal lines with independent integrations (Figure S3). Here, the duration of doxycycline induction and hence time of GATA4-mCherry exposure is held constant, such that the only difference between the conditions is GATA4-mCherry dosage. We will discuss this important function of Fig. S3 in a revised version of the manuscript.

      Unfortunately titrating doxycycline does not allow titrating transgene induction levels in a meaningful way, as sub-saturating doses of doxycycline lead to an increased heterogeneity in transgene expression with many non-expressing cells, rather than to reduced expression levels across all cells. See PMID: 17048983 for a possible explanation of this observation.

      Another point the authors should appropriately discuss and consider is that a lack of effect of different doses/durations of Gata4 expression could be due to the fact that by the time Gata6 is induced, the levels of Gata4 in cells previously treated for different periods of time are no longer detectably different. Such a regulation would equally result in indistinguishable cell fate proportioning. Can the authors exclude such a regulation? This is an important point at the heart of the authors' conclusion.

      The Reviewer seems to suggest that by separating the initiation of GATA6 expression from the GATA4-mCherry pulse in time, the decision to initiate PrE-like differentiation could be independent from GATA4-mCherry concentration, thus explaining the robust cell type proportions. The data shown in Figs. S2C, S3D and Fig. 3 A - C clearly exclude such a regulation: In conditions where we supply recombinant FGF4, the proportions of the different cell types scale with GATA4-mCherry expression levels, indicating that GATA4-mCherry dose does indeed affect Gata6 expression. In a revised version of the manuscript we will discuss and consider how these observations argue against a model where the decision to initiate PrE-like differentiation occurs independently from GATA4-mCherry levels.

      The authors make some general statements on cell differentiation (e.g. l205). They also claim that the Fgf4-based mechanism of lineage proportioning could act in a range of tissues during development. However, the use of the term differentiation for the induction of PrE-identity (or Gata-factor expression to be exact, see comment below) after Gata4 overexpression is problematic. The system chosen by the authors is entirely artificial. ES cells normally do not differentiate into extraembryonic cell types. It needs to be made clear in the manuscript that they do not study a differentiation process that normally occurs in the embryo or in differentiating ESC cultures. The system the authors are using would, in my opinion, rather qualify as cell programming or transdifferentiation than as differentiation. I suggest presenting the system using clearer unambiguous language and to try to avoid any generalisations based on an artificial transgene-overexpression based system. The results have to be presented with this limitation in mind.

      To address the Reviewer’s concerns regarding terminology, we will expand on the relationship of our system to normal ESC differentiation and lineage specification in the embryo, and discuss its possible limitations. We disagree however with the Reviewer’s assertion that using a transgene-based overexpression system precludes drawing any general conclusions. Rather, the system allows mimicking Epi- and PrE-like differentiation in a uniquely accessible context, and thereby to exploit the molecularly simple regulation of this cell fate decision for studying basic principles of cell differentiation. This view is supported by Reviewer #3 in the referees cross-commenting section below, who emphasizes the value of such models and notes that they are very common in developmental biology.

      It is unclear how 'PrE-like' (as stated e.g. in the abstract) the cells really are after a short pulse of Gata4 expression. No proper characterisation has been performed but needs to be included, if the authors want to term these cells PrE-like.

      A recent study by Amadei et al. (PMID: 33378662) supports the notion that a short pulse of GATA4 expression can trigger bona fide PrE-like differentiation. In this study, the authors induced a similar doxycycline-inducible GATA4 expression system for 6 hours, and observed subsequent differentiation into several PrE derivatives, including the anterior visceral endoderm. In a revised version, we will cite this study to support our claim that the GATA6-positive cells are indeed PrE-like. Additionally, we offer to perform immunostainings with an extended panel of known PrE marker proteins to substantiate the PrE-like character of the GATA6-expressing cells.

      How is the statement in l112 that "The clear separation between the two populations suggests that the increase in the proportion of double negative cells at the expense of GATA6+; NANOG- PrE-like cells beyond 40 h is mostly fueled by the downregulation of NANOG expression in the GATA6-negative cell population, combined with a slower proliferation of the GATA6-positive population, rather than by the reversion of PrE-like into double negative cells." supported by the data?

      We realize from the comments of all three reviewers that this section was confusing and potentially misleading in the original version of the manuscript. In a revision, we will reword this paragraph to better bring out the major conclusions from the GATA6 and NANOG expression patterns shown in Fig. S1A. These data show that the majority of cells belong to one of two discrete clusters from 16 h onwards. The clear separation of the two clusters furthermore indicates that cells rarely switch their gene expression patterns. Given these observations, the changes of cell type proportions reported in Figure S1B can be explained as a consequence of slower proliferation of cells in the GATA6-positive relative to the GATA6-negative cluster. In addition, NANOG expression in the GATA6-negative cluster declines over time, such that progressively more cells are classified as double negative.

      Would the data and modelling performed by the authors be in line with a model in which the decision to express Gata6 is a stochastic choice (with a certain probability based on the levels of Gata4 induction) that is then stabilized and reinforced by Fgf signalling rather than Fgf signalling having an instructive role?

      The simulations shown for the Fgf4 mutant case in Fig. 3 D - G, right column, are based on a model in which the decision to express Gata is a stochastic choice with a probability based on the initial levels of GATA expression, and reinforced by FGF signaling. Thus, our data from the Fgf4 mutant, but not the wild type, are perfectly in line with such a model.

      We realize from the Reviewer’s comment that we have not made sufficiently clear the conceptual differences between the models for the mutant and the wild type case. We suspect that this lack of clarity stems from the fact that the two models rely on the same circuitry, except for the regulatory link between GATA and FGF. This link however makes a crucial difference: It transforms the simple single cell input-output model of the mutant case, which is common to many previous publications, into a population level model with cell-cell feedback which shows new emergent behavior. And only this population level model, but not the single cell model for the Fgf4 mutant, can recapitulate the experimental data observed in the wild type. In a revised version of the manuscript we will expand on these crucial differences when describing the model and data in Fig. 3.

      The statement in line 187 "This indicates that GATA4-mCherry expression negatively regulates FGF4 signaling during cell type specification." is not supported by the data. The authors show only a correlation and actually correctly say so in line 195.

      Prompted by the comments of both Reviewer #1 and #3, we will carry out experiments to mechanistically explore the regulation of Fgf4 expression by GATA factors (see our response to Reviewer #1 above for a detailed description). Depending on the outcome of these experiments we will reword this statement.

      In Fig 2F statistical analysis between the re-seeded conditions is required for the conclusion that "the proportion of PrE-like cells systematically increased with cell density". Replating itself appears to quite drastically impact lineage distribution. Do the authors have an explanation for this?

      The p-value in line 221 of the original manuscript refers to a test for a linear trend between the three conditions following a one-way ANOVA in GraphPad Prism. We apologize that this has not been made clear and will add this information in a revised version.
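      For readers unfamiliar with this post-test, one common way to test for a linear trend across ordered groups after a one-way ANOVA is a linear contrast on the group means. The sketch below illustrates that calculation in plain Python. It is not necessarily identical to what GraphPad Prism computes, it assumes equally spaced groups, and the function name and example values are hypothetical rather than data from the manuscript.

```python
def linear_trend_f(groups):
    """F statistic for a linear contrast across ordered, equally spaced groups.

    groups: list of lists of measurements, ordered by the factor of interest
    (for example increasing cell density). Returns (F, df_contrast, df_within).
    """
    k = len(groups)
    means = [sum(g) / len(g) for g in groups]
    # Centred, equally spaced contrast coefficients, e.g. -1, 0, 1 for k = 3.
    coeffs = [i - (k - 1) / 2 for i in range(k)]
    contrast = sum(c * m for c, m in zip(coeffs, means))
    ss_contrast = contrast ** 2 / sum(c ** 2 / len(g) for c, g in zip(coeffs, groups))
    # Within-group (residual) sum of squares from the one-way ANOVA.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_within = sum(len(g) for g in groups) - k
    f_stat = ss_contrast / (ss_within / df_within)
    return f_stat, 1, df_within


# Made-up measurements for three ordered conditions.
f_stat, df_contrast, df_within = linear_trend_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

      The p-value follows from comparing the F statistic with an F distribution on the returned degrees of freedom, which requires a statistics library and is omitted here.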

      The observation that replating drastically impacts lineage distribution is perfectly in line with the overall conclusion from this section, namely that FGF signaling is enhanced by cell-cell contacts. Replating strongly reduces the number of direct cell-cell contacts by disrupting the colony structure of the culture. Thus it is expected that the proportion of the PrE-like cells - which require exposure to FGF ligands - is reduced under these conditions compared to the condition that has not been replated. We will discuss this explanation in a revision.

      Fig 2G shows a key experiment illustrating the local effect of Fgf4 expression on first and second neighbours. The authors have investigated this effect using a Fgf-signalling reporter. Why did they not assay Gata6 expression in this assay instead of a Spry reporter? This would be the experiment to show that also Gata6 expressing cells (after transient Gata4 induction) are clustered around Fgf4 producing cells and be a strong piece of evidence to show that local Fgf4 signalling and cell-cell communication is indeed involved in cell identity proportioning. The cell lines required for this experiment (including Fgf4 mutant Gata4 inducible ESCs) appear to be available.

      We decided to measure the FGF4 signaling range with a Spry4:H2B-Venus reporter because its response time is faster than that of GATA6 expression during differentiation. Furthermore, the Spry4:H2B-Venus reporter provides a quantitative readout for FGF4 signaling, in contrast to a binary read-out that would be expected for GATA6 expression. We will be happy to discuss these considerations in a revised manuscript.

      We agree that measuring FGF4 signaling range with Fgf4 mutant Gata4-mCherry inducible cells as suggested by the Reviewer constitutes a complementary approach to further corroborate the role of local FGF4 signaling in cell differentiation. However, we would like to stress that our demonstration of local FGF4 signaling is already supported by two fully orthogonal quantitative experiments, one relying on cell replating and the other one relying on the signalling reporter. The concept of local signaling is further supported by our quantitative analysis of the spatial arrangement of cell types in Fig. 4. The additional experiment suggested by the Reviewer is therefore unlikely to substantially change the paper’s conclusions, as also pointed out by Reviewer #3 in the referees cross-commenting section. Therefore, we offer to perform this experiment for a revision, but would like to seek the editor’s opinion on whether it is necessary to make the paper acceptable for publication.

      The authors conclude from data in Fig 3A that proper cell type proportioning depends on initial Gata4 levels in Fgf4 mutants, in contrast to WT cells where the initial levels appear more irrelevant. Is 10ng/ml too high a dose? Would using a lower concentration (such as ~2ng/ml suggested by Fig 2D to give WT-like distribution) result in a complete rescue of cell lineage proportioning in this assay? Formally a control of adding additional Fgf4 to WT cells will also be needed to control for a potential effect of exogenous Fgf4 addition.

      In our initial characterization of the Fgf4 mutant cell lines, we have performed experiments where we examined cell type proportions upon culture in the presence of different doses of FGF4 following doxycycline induction times between 1 h and 8 h. These experiments confirm the suspicion of the Reviewer that cell type proportions similar to the wild type can be obtained with a lower dose of 2.5 ng/ml FGF4 after 8 h of induction. For shorter induction times followed by differentiation in the presence of 2.5 ng/ml FGF4 however, cell type proportions were strongly skewed towards Epiblast-like cells. These data thus further support the major conclusion from Fig. 3A quoted by the Reviewer: Proper cell type proportioning in Fgf4 mutants depends on GATA4 levels, and this behavior is independent from the FGF4 concentration applied. We offer to add this data to a revised manuscript.

      The effects of adding FGF4 to wild type cells are shown in Fig. S2C and S3D in the current version of the manuscript. This control has been performed in all experiments shown in Fig. 3A - C, but we decided to omit it for clarity. We are happy to add this information back in as requested by the Reviewer.

      Does the model in Fig 3E consider potentially varying doses of exogenous Fgf4? Can the model also predict what happens if Fgf4 is added to WT cells, as suggested above as control? In general, the value of this model is unclear. Figure 3E is near impossible to understand, no quantitative information is given.

      The model in Fig. 3E can of course be simulated with different doses of exogenous FGF4. These simulations recapitulate the experimental results described under point 10 above: Cell type proportions for the Fgf4 mutant case are skewed towards NANOG-positive cells at lower FGF4 doses, and vary with initial conditions irrespective of FGF4 dose. We offer to show the results of these simulations in a revised manuscript alongside the experimental data discussed above.

      It is also possible to incorporate into the model addition of exogenous FGF4 to the wild type. Simulations of this condition confirm the experimentally observed increase in PrE-like cells shown in Fig. S2C and S3D of the current manuscript.

      To help the reader digest Fig. 3E, we will add separating lines similar to the gates of the flow cytometry data in panel A, and indicate the proportion of cells in the respective quadrants.

      The Reviewer’s comment that the value of the model is unclear indicates to us that we have not explained in sufficient detail the conceptual differences between the behavior of the model of the wild type and the mutant case. As detailed in our response to Reviewer’s comment 6. above, we will rewrite the text to bring out more clearly the insight that the model brings.

      Fig4A: Why were WT and Fgf4 mutant cells treated differently in this assay (8h vs 4h, respectively)?

      The spatial arrangement of cell types in Fgf4 mutant cells has been assayed in two conditions that give similar cell type proportions as seen in the wild type, as motivated in lines 366 - 370 of the current manuscript. We decided to show the condition with 4 h induction followed by differentiation in the presence of 10 ng/ml FGF4 in the main Figure 4 because it is most similar to the condition that gives wild-type like cell type proportions in the Fgf4 mutant shown in the immediately preceding main Figure 3, while the condition that uses 8 h induction followed by differentiation in the presence of 2.5 ng/ml FGF4 refers back to the main Figure 2. We show both primary data and the complete analysis for the latter condition in Figures S8D and S10. Fig. S10 provides a direct comparison between the two conditions and clearly demonstrates that they show similar dynamics. We do not think that exchanging the two datasets between main and supplementary Figures will add value to the manuscript.

      Does the interpretation that at 24h there is a difference in Fig 4C survive statistical scrutiny? Only few datapoints are shown and any apparent differences seem due to outliers rather than a shift in cluster radii. How often were these experiments independently repeated? This information is missing. In Fig 4B, I cannot appreciate any difference between cell lines.

      We will perform statistical testing to assess whether the spatial arrangement of cell types is significantly different between the time points, and mention the results in the text.

      To evaluate the spatial arrangement of cell types, we have performed two independent experiments in the wild type, and analyzed two conditions for the mutant case. In each experiment, we have analyzed at least eight positions per condition and control. Spatial clustering of wild type cells at 40 h is also observed in earlier Figures in the manuscript (e.g. Fig. 1D, S2B, S3C).

      The similarities between wild type and Fgf4 mutant cells shown in Fig. 4B are not surprising and fully in line with the data shown in panel C, which shows that differences between time points are much more pronounced compared to the differences between genotypes. However, we realize that the micrographs and analysis plots in Fig. 4A and B were perhaps not fully representative for the aggregate behavior shown in panel C. In a revision, we will therefore show data from more representative colonies in panels A and B.

      **Minor points:**

      a) More information on statistics should be given in the Figures and legends.

      To address this concern we will perform statistical tests for differences in proportions of the main cell types in Figures 1D and 3C. In addition, we will perform statistical testing on Fig. 4C as detailed in point 13 above.

      b) Percentages should be indicated in the quadrants of the FACS plots of Fig 3A and E.

      This is a good suggestion, we will add this information. See also our response to point 11 above.

      c) What is the underlying evidence for the statement: "The specification of Epi- and PrE-like cells in ESCs shows both molecular and functional parallels to the patterning of the ICM of the mouse preimplantation embryo."

      In the current manuscript, this statement is further substantiated in the subsequent paragraph (lines 483 - 503). We realize that this order is potentially confusing and will change it. We will further modify this section as part of our response to major point 3. above.

      d) Fig 5C is difficult to interpret without a comprehensive decoding of colour information.

      To facilitate interpretation of this panel, we will add a legend to decode the colour information of the traces (purple: VNPhigh, cyan: VNPlow)

      Reviewer #2 (Significance (Required)):

      This manuscript provides novel insights into the role of Fgf-mediated cell-cell communication to establish proper ratios of cell identities in a PrE-induction system. The authors provide some interesting data and interpretation. Overall, the significance is slightly impaired by the highly artificial nature of the studied cell fate specification event.

      This manuscript will be of interest to readers working on early embryonic cell fate decision as well as researchers working on modelling of cellular processes.

      My expertise lies in the field of cell fate decision and pluripotency.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      It is well established that FGF signalling plays a role in the partitioning of the Primitive Endoderm and Epiblast fates during preimplantation mammalian development. Recent work has shown that this fate decision is associated with a mechanism that is able to maintain the proportions of the two fates stable in the face of perturbations. Here, the authors address this mechanism and show that it is dependent on FGF signalling and associated with the fate decision. In the process they suggest and test a novel mechanism based on short range FGF signalling. A series of carefully designed and executed experiments refine and provide evidence for the model. This is an original and important piece of work that will influence the field of pattern formation.

      Overall the manuscript is well written but, at least from the perspective of this reviewer, there are places in which clarity can be improved.

      Lines 104 and ff: the description of the dynamics of the different populations after the GATA4 pulse can be clarified. The reference to the double negative population emerging from the PrEnd population is not clear. It is stated that the proportion of these cells increased continuously, and this is said to be at the expense of the PrEnd population, whose variation is referred to as 'slightly declined'. How can a slight decline fuel a steady increase in the double negative population?

      Also, what are these double negative cells? Could they be cells differentiating into embryonic lineages?

      We realize from the comments of all three Reviewers on this paragraph that it was confusing and potentially misleading in the original manuscript. In a revised version we will rewrite this section to clarify our interpretation of the data in Fig. S1. First, the clear separation of the two clusters observed in NANOG-GATA6 expression space indicates that cells rarely switch between the two clusters. Then, a likely explanation for the slow decline in the fraction of GATA6-positive cells is a slower proliferation compared to the GATA6-negative cells. Third, the increase in the proportion of double negative cells is caused by a progressive downregulation of NANOG expression in the GATA6-negative cluster. These NANOG expression dynamics are consistent with NANOG expression dynamics in epiblast cells of the embryo, and could indeed indicate differentiation towards embryonic lineages. We will mention this possibility in a revised manuscript.

      See also our response to Reviewer #1 and Reviewer #2, point 5.

      In Figure 1 and its discussion, it would be good to see a representation of the stability of the final proportions relative to the different initial conditions, a variation on 1E.

      This is a good suggestion. In a revised version, we plan to add a panel to Fig. 1 in which we plot the final proportions of the different lineages versus the GATA4-mCherry expression levels for the different induction times. This will illustrate more clearly that the final proportions of cell types are largely independent from the initial conditions.

      Paragraph lines 182 and ff: the report that GATA4 expression is able to suppress FGF4 signalling autonomously is, at least for this reviewer, a novel and important result and one that impinges on the understanding of the process. The authors should emphasize this.

      We agree with the Reviewer that the direct regulation of Fgf4 expression through GATA factors is a new regulatory link suggested by our data that has not been described before and that is crucial for the functioning of the system. Prompted by a similar comment of Reviewer #1 above, we offer to further explore the mechanistic basis of this link through an analysis of published ChIPseq data, functional studies of a GATA binding site upstream of the Fgf4 start codon, or a more detailed temporal dissection of NANOG, GATA and Fgf4 expression dynamics following doxycycline induction (see our response to Reviewer #1 above for more details). These new experiments and analyses will allow us to emphasize this novel result, and thereby significantly strengthen our paper.

      Paragraph lines 274 and ff (section on the involvement of FGF4 in the robustness of the process) needs some explanation. The derivation of the conclusion that "recursive communication via FGF4 underlies a population-level phenotype ...characterized by the differentiation of robust proportions of cell types..." from the experiments requires some unwrapping. It would be helpful if the authors could reason how the conclusion follows from the experiments.

      We realize from this Reviewer’s comment and the comments of Reviewer #2 above that we have not explained well enough how the results shown in Fig. 3 A-C (lines 274 - 283) lead to our conclusion of emergent behavior, which is then further substantiated by the modelling in panels D - G. The central conclusion of this paragraph rests on the observation that cell type proportions are dependent on initial conditions in the Fgf4 mutant, but not in wild type cells. As we had supplied FGF4 externally to the Fgf4 mutant cells, the only difference between these two conditions is that FGF4 dose in wild type cells is regulated by the cell population, i.e. cells can communicate via FGF4, whereas mutant cells cannot. We will expand on this line of reasoning, and also explain in more detail the differences in the models for the mutant case and the wild type, which we believe will help to conceptualize the experimental results. See also our response to Reviewer #2, points 6 and 11.

      Their model does not seem to include the commonly agreed regulatory interaction between Nanog and FGF4, at least not directly, and it would be helpful if a reasoning could be provided for this decision.

      A discussion of the regulatory interaction between NANOG and Fgf4 has also been requested by Reviewer #1. In our response to their point above, we explain why we have omitted it in the current manuscript. Briefly, our decision not to include a direct positive link between NANOG and Fgf4 expression rests on our observation that Fgf4 mRNA continues to be expressed 2 days after switching cells from 2i + LIF medium to N2B27, a time at which NANOG already starts to be downregulated as a consequence of differentiation along embryonic lineages. We will add these data to a revised manuscript. In addition, we propose above to dissect in more detail the temporal sequence of GATA4-mCherry, Fgf4 and NANOG expression upon doxycycline induction. This analysis will provide further information about the role of NANOG for Fgf4 mRNA expression in ESCs.

      Reviewer #3 (Significance (Required)):

      In this manuscript, Raina and colleagues use an Embryonic Stem (ES) cell based experimental system to address a central problem in developmental biology, namely the emergence of stable scaled populations of different cell fates. The experiments are elegant in design, carefully executed and the effort provides a solution to the problem: a novel mechanism based on short range FGF signalling that provides homeostatic control of relative cell populations. This is an important piece of work with sound conclusions that establishes a new paradigm in pattern formation whose implications are likely to lead to a reassessment of the role of FGF in different patterning paradigms. The experiments are quantitative and supported by a modelling effort based on a theoretical piece of work (Stanoev et al. 2021) which underpins the conclusion.

      This manuscript will appeal to a wide audience including developmental and stem cell biologists as well as modellers.

      My expertise covers the areas addressed in the manuscript.

      **Referees cross-commenting**

      It looks as if, with some nuances, we all agree on the value of the work. I do not have any issues with the comments of Reviewer 1, though I disagree that the model tested and improved here is similar to existing ones. While it is true that this work is related to a theory paper by some of the authors, the experimental test and resulting conclusions are very important. On the other hand, I am very surprised by the comments of Reviewer 2 who, after conceding the value and potential significance of the work, raises a list of queries, largely small details and opinions rather than points of substantial concern, hinting at a need for the authors to perform extra work and analysis that will not change the conclusions of the manuscript. Some of this, e.g. #9, would be a nice piece of additional evidence, but more an adornment than a necessity. The main problem of this reviewer is the lack of appreciation of what they define as the 'highly artificial nature' of the study, without providing any reason for why such experiments (very common in developmental biology) can lead to misleading conclusions. It seems to me that most, if not all, of their significant concerns can be dealt with in a rebuttal or by altering the text, either to discuss the issues raised, to clarify the points or to qualify the conclusions.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      It is well established that FGF signalling plays a role in the partitioning of the Primitive Endoderm and Epiblast fates during preimplantation mammalian development. Recent work has shown that this fate decision is associated with a mechanism that is able to maintain the proportions of the two fates stable in the face of perturbations. Here, the authors address this mechanism and show that it is dependent on FGF signalling and associated with the fate decision. In the process they suggest and test a novel mechanism based on short range FGF signalling. A series of carefully designed and executed experiments refine and provide evidence for the model. This is an original and important piece of work that will influence the field of pattern formation.

      Overall the manuscript is well written but, at least from the perspective of this reviewer, there are places in which clarity can be improved.

      Lines 104 and ff: the description of the dynamics of the different populations after the GATA4 pulse can be clarified. The reference to the double negative population emerging from the PrEnd population is not clear. It is stated that the proportion of these cells increased continuously, and this is said to be at the expense of the PrEnd population, whose variation is referred to as 'slightly declined'. How can a slight decline fuel a steady increase in the double negative population?

      Also, what are these double negative cells? Could they be cells differentiating into embryonic lineages?

      In Figure 1 and its discussion, it would be good to see a representation of the stability of the final proportions relative to the different initial conditions, a variation on 1E.

      Paragraph lines 182 and ff: the report that GATA4 expression is able to suppress FGF4 signalling autonomously is, at least for this reviewer, a novel and important result and one that impinges on the understanding of the process. The authors should emphasize this.

      Paragraph lines 274 and ff (section on the involvement of FGF4 in the robustness of the process) needs some explanation. The derivation of the conclusion that "recursive communication via FGF4 underlies a population-level phenotype ...characterized by the differentiation of robust proportions of cell types..." from the experiments requires some unwrapping. It would be helpful if the authors could reason how the conclusion follows from the experiments.

      Their model does not seem to include the commonly agreed regulatory interaction between Nanog and FGF4, at least not directly, and it would be helpful if a reasoning could be provided for this decision.

      Significance

      In this manuscript, Raina and colleagues use an Embryonic Stem (ES) cell based experimental system to address a central problem in developmental biology, namely the emergence of stable scaled populations of different cell fates. The experiments are elegant in design, carefully executed and the effort provides a solution to the problem: a novel mechanism based on short range FGF signalling that provides homeostatic control of relative cell populations. This is an important piece of work with sound conclusions that establishes a new paradigm in pattern formation whose implications are likely to lead to a reassessment of the role of FGF in different patterning paradigms. The experiments are quantitative and supported by a modelling effort based on a theoretical piece of work (Stanoev et al. 2021) which underpins the conclusion.

      This manuscript will appeal to a wide audience including developmental and stem cell biologists as well as modellers.

      My expertise covers the areas addressed in the manuscript.

      Referees cross-commenting

      It looks as if, with some nuances, we all agree on the value of the work. I do not have any issues with the comments of Reviewer 1, though I disagree that the model tested and improved here is similar to existing ones. While it is true that this work is related to a theory paper by some of the authors, the experimental test and resulting conclusions are very important. On the other hand, I am very surprised by the comments of Reviewer 2 who, after conceding the value and potential significance of the work, raises a list of queries, largely small details and opinions rather than points of substantial concern, hinting at a need for the authors to perform extra work and analysis that will not change the conclusions of the manuscript. Some of this, e.g. #9, would be a nice piece of additional evidence, but more an adornment than a necessity. The main problem of this reviewer is the lack of appreciation of what they define as the 'highly artificial nature' of the study, without providing any reason for why such experiments (very common in developmental biology) can lead to misleading conclusions. It seems to me that most, if not all, of their significant concerns can be dealt with in a rebuttal or by altering the text, either to discuss the issues raised, to clarify the points or to qualify the conclusions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In their manuscript entitled "Cell-cell communication through FGF4 generates and maintains robust proportions of differentiated cell types in embryonic stem cells" Raina et al. study the effect of Fgf-signalling based local cell-cell communication on the establishment of PrE-like and Epi-like cells. The authors use an elegant, albeit artificial, system to analyse the effect of Fgf signalling on establishing 'normal' lineage proportions after transient induction of Gata4 expression. The main conclusions of the manuscript are: i) Gata6 positive cells emerge through short range Fgf4 based cell-cell communication. ii) Fgf4 signalling can compensate for a wide range of initial levels of Gata6 expression and produce properly proportioned cell identities. The authors also state that this mechanism could operate in a range of developing tissues.

      Major points:

      1. Fgf4 KO ESCs are deficient in initiating epiblast lineage differentiation (Kunath 2007). Therefore, the effect studied by the authors might be multifactorial and the general inability of Fgf4 deficient cells to enter differentiation might contribute to the observed differentiation defects and defects of cell fate proportioning. Specifically, it could be expected that Nanog regulation is affected in Fgf4 mutants, although, to my knowledge, the specific phenotype of Fgf4 depletion has not been evaluated in Gata4 induced cell programming towards PrE. What steps have the authors taken to exclude an impact of general cell fate change defects in Fgf4 KO ESCs?
      2. Increasing the time of Gata4 expression results in increasing Gata4 levels (Fig 1C). This is shown at the overall mean fluorescence level. However, it is important to also quantify how many cells actually show some increase in Gata4 levels. Fig 1D suggests that the number of Gata4 expressing cells is quite similar between 4h and 8h induction, but this needs to be quantified. An explanation for the apparent dosage independence of Gata4 could then be simple threshold effects, such that there is no additional effect of increased Gata4 levels in WT cells, without any further requirement of feedback regulation, after a certain threshold level of Gata4 is reached. Have the authors considered such a simple model? An important point is that in the current setup dosage effects and effects of extended presence of Gata4 cannot be distinguished. Wouldn't titrating the amount of doxycycline used for induction be a more direct way to achieve different initial levels of Gata4 expression? Another point the authors should appropriately discuss and consider is that a lack of effect of different doses/durations of Gata4 expression could be due to the fact that by the time Gata6 is induced, the levels of Gata4 in cells previously treated for different periods of time are no longer detectably different. Such a regulation would equally result in indistinguishable cell fate proportioning. Can the authors exclude such a regulation? This is an important point at the heart of the authors' conclusion.
      3. The authors make some general statements on cell differentiation (e.g. l205). They also claim that the Fgf4-based mechanism of lineage proportioning could act in a range of tissues during development. However, the use of the term differentiation for the induction of PrE-identity (or Gata-factor expression to be exact, see comment below) after Gata4 overexpression is problematic. The system chosen by the authors is entirely artificial. ES cells normally do not differentiate into extraembryonic cell types. It needs to be made clear in the manuscript that they do not study a differentiation process that normally occurs in the embryo or in differentiating ESC cultures. The system the authors are using would, in my opinion, rather qualify as cell programming or transdifferentiation than as differentiation. I suggest presenting the system using clearer unambiguous language and to try to avoid any generalisations based on an artificial transgene-overexpression based system. The results have to be presented with this limitation in mind.
      4. It is unclear how 'PrE-like' (as stated e.g. in the abstract) the cells really are after a short pulse of Gata4 expression. No proper characterisation has been performed; one needs to be included if the authors want to term these cells PrE-like.
      5. How is the statement in l112 that "The clear separation between the two populations suggests that the increase in the proportion of double negative cells at the expense of GATA6+; NANOG- PrE-like cells beyond 40 h is mostly fueled by the downregulation of NANOG expression in the GATA6-negative cell population, combined with a slower proliferation of the GATA6-positive population, rather than by the reversion of PrE-like into double negative cells." supported by the data?
      6. Would the data and modelling performed by the authors be in line with a model in which the decision to express Gata6 is a stochastic choice (with a certain probability based on the levels of Gata4 induction) that is then stabilized and reinforced by Fgf signalling rather than Fgf signalling having an instructive role?
      7. The statement in line 187 "This indicates that GATA4-mCherry expression negatively regulates FGF4 signaling during cell type specification." is not supported by the data. The authors show only a correlation and actually correctly say so in line 195.
      8. In Fig 2F statistical analysis between the re-seeded conditions is required for the conclusion that "the proportion of PrE-like cells systematically increased with cell density". Replating itself appears to quite drastically impact lineage distribution. Do the authors have an explanation for this?
      9. Fig 2G shows a key experiment illustrating the local effect of Fgf4 expression on first and second neighbours. The authors have investigated this effect using a Fgf-signalling reporter. Why did they not assay Gata6 expression in this assay instead of a Spry reporter? This would be the experiment to show that also Gata6 expressing cells (after transient Gata4 induction) are clustered around Fgf4 producing cells and be a strong piece of evidence to show that local Fgf4 signalling and cell-cell communication is indeed involved in cell identity proportioning. The cell lines required for this experiment (including Fgf4 mutant Gata4 inducible ESCs) appear to be available.
      10. The authors conclude from data in Fig 3A that proper cell type proportioning depends on initial Gata4 levels in Fgf4 mutants, in contrast to WT cells where the initial levels appear more irrelevant. Is 10ng/ml too high a dose? Would using a lower concentration (such as ~2ng/ml suggested by Fig 2D to give WT-like distribution) result in a complete rescue of cell lineage proportioning in this assay? Formally a control of adding additional Fgf4 to WT cells will also be needed to control for a potential effect of exogenous Fgf4 addition.
      11. Does the model in Fig 3E consider potentially varying doses of exogenous Fgf4? Can the model also predict what happens if Fgf4 is added to WT cells, as suggested above as control? In general, the value of this model is unclear. Figure 3E is near impossible to understand, no quantitative information is given.
      12. Fig 4A: Why were WT and Fgf4 mutant cells treated differently in this assay (8h vs 4h, respectively)?
      13. Does the interpretation that at 24h there is a difference in Fig 4C survive statistical scrutiny? Only a few datapoints are shown and any apparent differences seem due to outliers rather than a shift in cluster radii. How often were these experiments independently repeated? This information is missing. In Fig 4B, I cannot appreciate any difference between cell lines.

      Minor points:

      a) More information on statistics should be given in the Figures and legends.

      b) Percentages should be indicated in the quadrants of the FACS plots of Fig 3A and E.

      c) What is the underlying evidence for the statement: "The specification of Epi- and PrE-like cells in ESCs shows both molecular and functional parallels to the patterning of the ICM of the mouse preimplantation embryo."

      d) Fig 5C is difficult to interpret without a comprehensive decoding of colour information.

      Significance

      This manuscript provides novel insights into the role of Fgf-mediated cell-cell communication to establish proper ratios of cell identities in a PrE-induction system. The authors provide some interesting data and interpretation. Overall, the significance is slightly impaired by the highly artificial nature of the studied cell fate specification event.

      This manuscript will be of interest to readers working on early embryonic cell fate decision as well as researchers working on modelling of cellular processes.

      My expertise lies in the field of cell fate decision and pluripotency.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript Raina et al. use an in vitro model of PE specification based on the transient overexpression of GATA4 in ESCs to show that the acquisition of primitive endoderm (PE) identity is governed at the population level by cell-cell interactions mediated by FGF signaling. The authors further argue that the specification of a defined proportion of "PE" and "Epiblast" cells in a differentiating population of ESCs is an emergent property of a system where paracrine signaling shifts the balance between two alternative stable states. Overall, the work does not reach radically new conclusions: broadly similar models are outlined in several other publications, including from the authors. Yet this study makes use of elegant genetic models and is particularly well executed. In addition, it includes a very accurate characterisation of the spatial range of FGF signaling activity that is original and adds to the existing knowledge. Moreover, the authors show novel evidence suggesting that GATA factors inhibit Fgf4 transcription and the activity of the FGF signaling pathway in ESCs.

      Two major points deserve further clarification:

      In this manuscript the authors claim that the proportion of cells acquiring PE fate is, at least in the experimental setup adopted, largely independent of the levels of GATA4 induction, and therefore of the initial state of the gene regulatory network regulating this cell fate transition. However, the authors should discuss how the current findings relate to their previous results, showing that the duration/levels of Gata4 induction, in a similar experimental setting, play an important role in determining the final proportion of cells acquiring "PE" fate. Absolute expression levels may be crucial for this distinction, but the authors seem to exclude this possibility (see figure S3).

      Most importantly, the authors incorporate in their model the notion that GATA6 inhibits FGF signaling. It would be interesting to understand how such inhibition is mechanistically mediated. For instance GATA6 has been shown to bind in proximity of the Fgfr2 gene (Wamaitha et al., Genes and Dev., 2015). Alternatively, the authors show a direct effect on Fgf4 expression. The short time window of the reported repressive transcriptional effects (8h, Fig 2 middle), might suggest a direct regulation. The authors should test this possibility, and discuss what alternative modes of regulation could be envisaged (for instance, indirect effects mediated by Nanog). This is a key result that deserves a more detailed mechanistic characterisation.

      Minor points:

      Fig S1: The authors should show quantifications of Nanog and GATA6 levels before the beginning of the differentiation protocol.

      Line 106: The authors write "the initially large proportion of GATA6+; NANOG+ double positive cells". It appears that at 16h of differentiation ESCs have already partitioned between Gata6 or Nanog expressing cells. The authors should rephrase the sentence to reflect what seems to be an almost total absence of truly double positive cells. Possibly, an analysis conducted at earlier time points could clarify these dynamics.

      Line 124: The authors write "... concentration dependent downregulation of NANOG expression". The effects may rather depend on the time of doxycycline stimulation.

      Line 192: The authors write "...and confined to cells with low GATA4-mCherry expression levels". It would be helpful to have an indication of the cell boundaries, possibly showing localisation of a membrane bound protein.

      It would be interesting for the authors to discuss how the spatial range of FGF activity measured in culture could affect PE specification in the embryo.

      Significance

      See above.

  2. Apr 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Review of "Co-chaperone involvement in knob biogenesis implicates host-derived chaperones in malaria virulence." by Diehl et al for Review Commons.


      **Major Comments.**

      1. In this paper the function of the Plasmodium falciparum exported protein PFA66 is investigated by replacing its functionally important dnaJ region with GFP. These modified parasites grew fine but produced elongated knob-like structures, called mentulae, at the surface of the parasite-infected RBCs. Knobs are elevated platforms formed by exported parasite proteins at the surface of the infected RBC that are used to display PfEMP1 cytoadherence proteins, which help the parasites avoid host immunity. The mentulae still display some PfEMP1 and contain exported proteins such as KAHRP but can no longer facilitate cytoadherence. Complementation of the truncated PFA66 with full length protein restored normal knob morphology; however, complementation with a non-functional HPD to QPD mutant did not restore normal morphology, implying that interaction of PFA66 with a HSP70, possibly of host origin, is important for function. While a circumstantial case is made for PFA66 interacting with human HSP70 rather than parasite HSP70-x, is there any direct evidence for this, e.g. protein binding evidence? I feel that without some additional evidence for a direct interaction between PFA66 and human HSP70 the paper's title is a little misleading.

        We thank the reviewer for their kind words. They are correct that we do not show direct evidence of such an interaction, but would like to note that we, and others, despite concerted efforts to produce direct evidence, have always been hindered by the nature of the experimental system. As noted also in our reply to Reviewer 3, the inability to genetically modify the host cell leads us to suggest that indirect evidence is the best that can conceivably be provided at this time. Our evidence, although indirect, is the first experimental evidence for the importance of such an interaction, all other suggestions having been based on “guilt by association” i.e. protein localisation or co-IP analyses.

      Was CSA binding restored upon complementation of ∆PFA with the full-size copy of PFA66?

      As this project grew organically and was driven by the results already obtained, we decided to use knob morphology via SEM as a “proof-of-principle” to show that we could reverse the phenotype. Thus, while we cannot comment on whether ALL functions of PFA66 are complemented, we suspect that if the knobs revert to their WT morphology, this is likely to be true for the other tested phenotypes. We do not feel that revisiting all of our assays (which would basically entail repeating almost every experiment so far carried out) would really be much more informative. We have added a note in the discussion stating “We wish to note that we cannot unequivocally state that our complementation construct allows reversion of all the aberrant phenotypes herein investigated, however we feel it likely that all abnormal phenotypes are linked and thus our “proof-of-principle” investigation of knob/eKnob phenotypes is likely to be reflected in other facets of host cell modification and can thus be seen as a proxy for such.”.

      **Minor Comments**

      Line 36, NPP should be NPPs if referring to the plural.


      Changed


      Line 37, MC should be MCs if referring to the plural. By the way this acronym is never used in the text, it's always written 'Maurer's clefts'.

      Changed

      Abstract, Line 52-53, could be changed to "uncover a new KAHRP-independent..." as it currently implies (albeit weakly) that this is the first observation of a KAHRP-independent mechanism for correct knob biogenesis. Maier et al 2008 have previously shown that knock out of PF3D7_1039100 (J-domain exported protein) greatly reduced knob size and knock out of PHISTb protein PF3D7_0424600 resulted in knobless parasites.

      Correct. In line with the suggestions of another reviewer, this section has been changed.

      In the Abstract it is mentioned that "Our observations open up exciting new avenues for the development of new anti-malarials." This is never really expanded upon in the rest of the paper and so seems like a bit of a throwaway line and could be left out.

      Good point, changed

      Line 59, WHO world malaria report should be cited here since these numbers are from the report not a paper from 2002.

      Done

      Line 67, Marti et al 2004 should be cited here as its published at the same time as Hiller et al 2004.

      Our mistake. Done

      Line 76, I suggest using either 'erythrocyte' or 'red blood cell' throughout the text not both.

      We now use erythrocyte throughout

      Line 80, Maier et al 2008 should be referenced here.

      Done

      Line 87, the authors should cite Birnbaum et al 2017 for the technique used. This is cited immediately after (line 98) in the results section but could be addressed at both points in the text.

      Done

      Line 123, IFAs and live cell imaging failed to detect the PFA-GFP protein and the author proposes this is due to low expression levels. However, PFA66 is expressed at ~350 FPKM in the ring stage and previous studies from your own group have visualised it using GFP before. Is there another explanation for this such as disruption of the locus here has served to greatly reduce the expression level of the fusion protein?

      The truncated protein is now distributed throughout the whole erythrocyte cytosol, not concentrated into J-dots, likely making detection difficult. We wish to note that our original GFP tagged PFA66 lines (Külzer et al, 2010) did not really show a strong signal in comparison to other lines we are used to analysing. We further believe that the sub-cellular fractionation (Figure S1) demonstrates the erythrocyte cytosolic localization of the truncated PFA66. We have no evidence that truncation causes lower expression, but any future revision will include a comparison of expression levels of endogenously GFP tagged dPFA and PFA66.

      Line 147, for consistency it would be best to introduce infected red blood cell (iRBC) at the beginning of the main text and use throughout the text instead of switching between 'infected human erythrocyte' and iRBC.

      We agree, and have changed accordingly

      Line 153, Fig S2A does not exist.

      We apologise, this has been changed

      Lines 156-158: Different knob morphologies are described with repeated reference to Fig2 and FigS2. Since multiple whole-cell SEM images are displayed in these figures it would be worth adding lettering and/or zoomed-in regions of interest highlighting examples of each aberrant knob type.


      This has now been added to Figure S2.

      Line 178-179, "Although not highly abundant in either sample, the morphology of Maurer's clefts appeared comparable in both samples (data not shown)." Why is the data not shown? Representative images of Maurer's clefts from each line should be included in the supplementary figures or this in-text statement should be more clearly justified.

      Figure S3 has been adjusted to also show Maurer's clefts in more detail. An Excel table of the data can be provided if necessary.

      Line 196, indirect immunofluorescence assay (IFA).


      Changed

      Line 201, how was the 'non-significant difference' measured? PHISTc looks quite different by eye. Rephrase the term "significant difference" as localisation of these exported proteins was compared visually rather than quantified. Otherwise, a measure of mean fluorescence intensity could be taken for each protein as a basic comparison between the two lines. In the Figure legend of S4, the term "no drastic difference", is used suggesting this was not quantified. By the way, PHISTc appears different by the represented figure.

      We apologise for our use of a specific term for non-statistically verified observations. The PHISTc image the reviewer comments on was presented incorrectly (too much brightness introduced during processing) and has now been corrected. We mean to say that we could not (in a blinded check) tell the difference between WT and KO IFA images. Only KAHRP (in our opinion) demonstrated a different fluorescence pattern. As KAHRP has previously been implicated in knob formation, we then analysed this phenotype in more detail. A detailed analysis of the fluorescence pattern in the other IFAs would not, in our view, add to the story or add any real value to our observations.

      Line 213, you now have 3 versions for the word wild type, 'wild type', 'wild-type' and 'WT', best to choose one for consistency.

      Changed

      Line 232, 'tubelike' to 'tube-like'.

      Changed

      Line 279, just use 'IFA', the acronym has already been explained earlier in the text.

      Changed

      Line 319, 'permeation' should be 'permeability'.

      Changed

      Line 353, 'The action of host actin is known' to 'Host actin is known'.

      Changed

      Line 373, 'through their role as regulators'.

      Changed

      Line 402, either use 'HSP70-x' or 'HSP70-X' throughout the text.

      Changed

      Line 540, the speed used to pellet the samples for sorbitol lysis assay, 1600g is quite high and could reflect RBC fragility rather than direct sorbitol induced lysis. The parasitemia is also very low, and previous published methods have used ~90% parasitemia rather than the 2% used here. We are not saying the method is wrong but please check it is accurate.

      We used the method of our former colleague Stefan Baumeister (University of Marburg), who is an expert in the analysis of NPPs, thus we are sure the method is correct. We are in fact tempted to remove the NPP data as they distract from the main narrative of the manuscript, this being the reason we include them only as supplementary data.

      Line 479, 10µm should be 10 µM.

      Changed

      In Fig 1A, the primers A, B, C etc are not explained anywhere that I can see.

      This information has now been included in the 1A Figure legend and table 2A.

      Figure 1B, I do not see any clear band for the 3' integration indicated with the *. Can a better image be shown?

      We apologise. Integration PCRs are notoriously challenging. Any revised manuscript will include better quality images.

      It seems from Fig 3G,H,I that the KAHRP puncta are bigger in ∆PFA but are as abundant as CS2. Given that KAHRP is associated with knobs how do you reconcile this with there being fewer knobs per unit area in ∆PFA compared to CS2 as in Fig 2B? The numbers of knobs/KAHRP spots/Objects per um2 seems to vary between Fig 2 and 3. Please provide some commentary about this.

      We are not sure if all KAHRP spots actually label eKnobs, and it is possible that there are KAHRP “foci” that are not associated with eKnobs. We also wish to note that the data in figure 2 and 3 were produced using very different techniques. Sample preparation may lead to membrane shrinkage or stretching, and the different microscopy techniques have very different levels of resolution. For this reason we do not believe that the data from these very different independent experiments can be compared, however a comparison within a data set is possible and good practice.

      In the bottom panels of Fig 4, KAHRP::mCherry appears to extend beyond the glycocalyx beyond the cell. Is this an artifact?

      We checked assembly of the figure and are sure that this was not introduced during production of the figure. Our only explanation is that WGA does not directly stain the erythrocyte membrane, but the glycocalyx. A closer examination of the WGA signal reveals that it is weaker at this point (and also in the eKnobs i, ii) so potentially the KAHRP signal is beneath the erythrocyte plasma membrane, but the membrane cannot be visualised at this point.

      Line 837, does this refer to 10 technical replicates or was the experiment repeated on 10 independent occasions? This should at least be done in 2 biological replicates given the range in technical replicates on the graph. Was CS2 considered as '100% lysis' or the water control described in the method? Please provide more detail.


      This figure is the result of 10 biological and 4 technical replicates. A number of data points were removed as outliers lying outside the normal distribution (Grubbs' test). The highest value within a biological replicate was set to 100% to allow comparison of results. This has now been corrected in the text.
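
      The two processing steps named in this response (outlier screening and per-replicate normalisation) can be sketched as follows. This is a minimal illustration only, not the authors' analysis script: the function names `grubbs_statistic` and `normalise_to_max` and the example readings are hypothetical, and the critical value against which G is compared (it depends on sample size and significance level) has to be taken from a Grubbs table or a statistics package.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (G, index) for the value lying furthest from the sample mean.

    G = max|x_i - mean| / s, where s is the sample standard deviation.
    The caller compares G with a tabulated critical value (assumption:
    the threshold is supplied externally, it is not computed here).
    """
    if len(values) < 3:
        raise ValueError("Grubbs' test needs at least three values")
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("all values are identical; G is undefined")
    idx = max(range(len(values)), key=lambda i: abs(values[i] - m))
    return abs(values[idx] - m) / s, idx


def normalise_to_max(values):
    """Rescale one biological replicate so its highest value equals 100%."""
    top = max(values)
    if top <= 0:
        raise ValueError("the highest value must be positive")
    return [100.0 * v / top for v in values]


# Hypothetical lysis readings from one biological replicate.
readings = [1, 1, 1, 1, 10]
g, suspect = grubbs_statistic(readings)
percent = normalise_to_max([2.0, 4.0, 8.0])
```

      Setting the highest value of each biological replicate to 100% makes replicates comparable with each other, but it also means the absolute lysis level is no longer visible in the plot.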

      Reviewer #1 (Significance (Required)):

      This is a reasonably significant publication as it describes knob defects that to my knowledge have never been observed before. Importantly, the deletion of the J domain from PFA66 is genetically complemented to restore function really confirming a role for this protein in knob development. Amino acids critical for the function of the J-domain are also resolved. Apart from some minor technical and wording issues the paper is really nice work apart from one area which is the proposed partnership of PFA66 with human HSP70 for which there is not much direct evidence. If this evidence can be provided, we think this work could be published in a high impact journal. Without the evidence, it could find a home in a mid-level journal with some tempering of the claims of PFA66's interaction with human HSP70.

      **Referee Cross-commenting**


      There seems to be a high degree of similarity in the reviewers' comments and I think as many issues as possible should be addressed. I definitely agree that the term mentula should not be used.


      We have now adopted the suggestion of Reviewer 3, and use the term eKnobs.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Plasmodium falciparum exports several proteins that contain J-domains and are hypothesized to act as co-chaperones to support partner HSP70 chaperones in the host erythrocyte, but the function of these co-chaperones is largely unknown. Here the authors provide a functional analysis of one of these exported HSP40 proteins known as PFA66 by using the selection-linked integration approach to generate a truncation mutant lacking the C-terminal substrate binding domain. While there is no fitness cost during in vitro culture, light and electron microscopy analysis of this mutant reveals defects in knob formation that produces a novel, extended knob morphology and ablates Var2CSA-mediated cytoadherence. These knob formation defects are distinct from previous mutants and this unique phenotype is exploited by the authors to show that the HSP70-stimulating "HPD" motif of PFA66 impacts rescue of the altered knob phenotype. In other HSP40 co-chaperones, this motif is critical to stimulate partner HSP70 activity, suggesting that PFA66 acts as a bona fide co-chaperone. Importantly, previous work by the Przyborski lab and others has shown that deletion of PfHSP70x, the only HSP70 exported by the parasite, does not phenocopy the PFA66 mutant, implying that the partner HSP70 is of host origin. The results are exciting but I have some concerns about controls needed to properly interpret the functional complementation experiments. My specific comments are below.


      We agree that some control experiments are missing, and these will be included in any future revision.

      **Major comments**

      • The failure of the HPD mutant PFA66 to rescue the knob-defect is very interesting. However, the authors need to determine that the HPA mutant is expressed at the same level as the WT (by quantification against the loading controls in the western blots in Fig 1D and Fig S6H) and is properly exported (by IFA and/or WB on fractionated iRBCs, as done for the GFP-fused truncation in Fig S1A). Otherwise, the failure to rescue is hard to interpret. If these controls were in place, the conclusion that a host HSP70 is likely being hijacked by PFA66 is appropriate. This genetic data would be greatly strengthened by in vitro experiments with recombinant protein showing activation of a host HSP70 by PFA66, but I realize this may be out of the scope of the present study. Along these lines, it might be worth discussing the finding by Daniyan et al 2016 that recombinant PFA66 was found to bind human HSPA1A with similar affinity to PfHSP70x but did not substantially stimulate its ATPase activity, suggesting this is not the relevant host HSP70. This study is cited but the details are not discussed.

      As in our answer to Reviewer 1, we will examine the expression and localisation of both WT and mutant PFA66.

      We are currently expressing and purifying a number of HSP40/70 combinations for exactly the kind of analysis suggested and hope to include such data in future revisions, but as the reviewer fairly notes, this is really beyond the scope of the current study.

      Regarding Daniyan et al (and other) papers: The fact that PFA66 can stimulate PfHSP70x does not preclude that it also interacts with human HSP/HSC70, and indeed there is some stimulation of human HSP70. Daniyan and colleagues did steady-state assays in the absence of nucleotide exchange factors. Therefore, the stimulation of human HSP/HSC70 is not very prominent. One should either do single-turnover experiments or add a nucleotide exchange factor to make sure that nucleotide exchange does not become rate-limiting for ATP hydrolysis. This is completely independent of the results for PfHSP70-X, as the intrinsic nucleotide exchange rates of the studied HSP70s could be very different. Also, it is important to understand that J-domain proteins generally do not stimulate ATPase activity much by themselves but in synergism with substrates, allowing the possibility that such an in vitro assay may not reflect the situation in cellula. Additionally, the resonance units in the SPR analysis for PFA66-HsHSP70 are lower than those for PFA66-PfHSP70-X. This could mean that PFA66 is a good substrate for PfHSP70-X but not for HsHSP70, but this does not mean that PFA66 does not cooperate with HsHSP70.

      - The authors claim that truncation of PFA66 alters the localization of KAHRP but not the other exported proteins they evaluated by IFA (Fig S4). This seems baseless as they don't apply the same imageJ evaluation to these other proteins. Similarly, the statement that KAHRP structures "appear by eye to have a lower circularity, although we were not able to substantiate this with image analysis" is subjective/qualitative and should probably be removed.

      We mean to say that we could not (in a blinded check), tell the difference between WT and KO IFA images. Only KAHRP (in our opinion) demonstrated a different fluorescence pattern. As KAHRP has previously been implicated in knob formation, we then analysed this phenotype in more detail. A detailed analysis of the fluorescence pattern in the other IFAs does, in our eyes, not add to the story or add any real value to our observations.

      The statement on the circularity has been removed according to the reviewer's wishes.

      -The section title "Chelation of membrane cholesterol...causes reversion of the mutant phenotype in ∆PFA" seems an overstatement given the MBCD effect on the knob morphology is fairly weak and remains significantly abnormal.

      The title of this section was misleading, we agree. We have retitled it “Chelation of membrane cholesterol but not actin depolymerisation or glycocalyx degradation causes partial reversion of the mutant phenotype in ∆PFA” to clarify that the reversion was only partial (as explained by the following text in the manuscript).

      **Minor comments**

      - The DNA agarose gel image in Fig 1B is not very convincing. Most of the bands are faint and there is a lot of background/smear signal in the lanes. Also, it would help for clarity if the primer pairs used for each reaction were stated as shown in the diagram (rather than simply "WT", "5' Int" and "3' Int").

      We apologise. Integration PCRs are notoriously challenging. Any revised manuscript will feature clearer images.

      - Given the vulgar connotation of "mentula", the authors might consider an alternative term.

      We have now adopted the term “eKnobs” suggested by Reviewer 3.

      - lines 67-69: The authors may wish to cite a more recent review that takes into account updated Plasmepsin 5 substrate prediction from Boddey et al 2013 (PMID: 23387285). For example, Boddey and Cowman 2013 (PMID: 23808341) or de Koning-Ward et al 2016 (PMID: 27374802).

      A fair point, we have now added de Koning-Ward et al 2016.

      - lines 77-79: "deleted" is repetitive in this sentence.

      Changed

      - line 115: It might be clearer to state "endogenous PFA66 promoter"

      Changed

      - lines 131-132: "...these data suggests that deletion of the SBD of PFA66 leads to a non-functional protein." Behl et al 2019 (PMID: 30804381) showed the recombinant C-terminal region of PFA66 (residues 219-386, including the SBD truncated in the present study) binds cholesterol. The authors may wish to mention this along with their reference to Kulzer et al 2010 showing PFA66 segregates with the membrane fraction, suggesting cholesterol is involved in J-dot targeting.

      We should have noted this connection and thank the reviewer for bringing it to our attention. This section has been revised to include this important information.

      - line 198: It's not clear what is meant by "+ve" here and afterward. Please define.

      We have now changed this to “structures labelled by anti-KAHRP antibodies”, or merely “KAHRP”.

      - lines 749-750: "Production of PFA and NEO as separate proteins is ensured with a SKIP peptide". Translation of the 2A peptide does not always cause a skip (see PMID: 24160265) and often yields only about 50% skipped product (for example, PMID: 31164473). Because of the close cropping in the western blots in Fig 1C or S1A this is difficult to assess. Is a larger unskipped product also visible? Beyond this one point, it is generally preferable that the blots not be cropped so close.

      A very valid point, and in other parasite lines we have indeed detected non-skipped protein. In our case, we visualise a band at the predicted molecular mass for the skipped dPFAGFP and the commonly observed circa 26 kDa GFP degradation product. The full-length blots have now been included as supplementary data (Figure S7).

      - lines 867-868: Explain more clearly what "Cy3-caused fluorescence" is measuring.

      The Cy3 channel refers to anti-var2CSA staining, and we have now included this information.

      - Several figure legends would benefit from a title sentence describing what the figure is about (ie, Fig legends 1, 3, 5, S1, S5 & S6)

      This has been added.

      Reviewer #2 (Significance (Required)):

      This manuscript by Diehl et al reports on the function of the exported P. falciparum J-domain protein PFA66 in remodeling the infected RBC. Obligate intracellular malaria parasites export effector proteins to subvert the host erythrocyte for their survival. This process results in major renovations to the erythrocyte, including alteration of the host cell cytoskeleton and formation of raised protuberances on the host membrane known as knobs. Knobs serve as platforms for presentation of the variant surface antigen PfEMP1, enabling cytoadherence of the infected RBC to the host vascular endothelium. This process is of great interest as it is critical for parasite survival and severe disease during in vivo infection. The basis for trafficking of exported effectors within the erythrocyte after they are translocated across the vacuolar membrane is not well understood but is known to involve chaperones. This is a particularly interesting study in that it provides evidence in support of the hypothesis, initially proposed nearly 20 years ago, that the parasite hijacks host chaperones to remodel the erythrocyte. This is biologically intriguing and also suggests new therapeutic strategies targeting host factors that would not be subjected to escape mutations in the parasite genome. The work will be of interest to those studying exported protein trafficking and/or virulence in Plasmodium (such as this reviewer) as well as the broader chaperone and host-pathogen interaction fields.

      **Referee Cross-commenting**

      I also agree with similarity in comments. Some additional discussion on the failure to localize the PFA66 truncation by live FL is warranted, as noted by reviewer #1. Seems likely that either the level of PFA66 protein is reduced by the truncation or the truncated PFA66 is dispersed from J-dots and harder to visualise when diffuse instead of punctate. In either case, the complementing copy (WT or QPD) should be visualized by IFA.


      As noted above, we believe our inability to visualize the truncated protein is likely due to its dispersal throughout the whole erythrocyte cytosol as opposed to lower expression levels, but we will be checking this, and also the localisation of WT and mutant PFA66 complementation chimera and expect to have this result for the next revision.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The data are for the most part well controlled and reveal a potential function for PFA66 in knob formation. The assays are state of the art and the data provides insight into knob formation.

      However, some conclusions are not fully supported by the data. For example, 'uncover a KAHRP-independent mechanism for correct knob biogenesis' (line 52-53) is not supported by the data because PFA66 truncation could result in misfolding of KAHRP and thus lead to knob biogenesis defects.

      We meant to imply that not only perturbations/absence of KAHRP lead to aberrant knobs. This is now changed to “…uncover a new KAHRP-independent molecular factor required for correct knob biogenesis.”.

      The other major issue is that despite having a complemented parasite line in hand, the parental parasite line is used as a control for almost all assays. This is a critical issue because an alternative explanation for their data would be that expression of truncated PFA66 leads to expression of a misfolded protein that aggregates in the host RBC OR it clogs up the export pathway and indirectly leads to knob biogenesis defects. It is surprising that the authors do not test the localization of dPFA using microscopy especially since it is tagged with GFP. While the complemented parasite line does revert back, this could also be due to the fact that the complement overexpresses the chaperone helping mitigate issues caused by the truncated protein.

      As all virulence characteristics we monitor in this study have been verified many times in the parental CS2 parasites in the literature, we think that the best comparative control for the truncated cell line is indeed the parental line. The large part of our study aimed to characterize differences in various characteristics upon inactivation of PFA66 function, and for this reason we used the parental WT line as a control. Using the complementation line would not truly reflect the effect of PFA66 truncation, as PFA66::HA was not expressed from an endogenous locus, but rather from an episomal plasmid. This itself may result in expression levels which differ from WT, and thus this parasite line cannot be seen as the gold-standard control for assaying PFA66 function.

      We did indeed try to localize dPFA (lines 122-123 in the original manuscript), but were unsuccessful, likely due to diffusion of dPFA throughout the entire erythrocyte cytosol (as opposed to concentration into J-dots as the WT). For this reason we carried out fractionation instead, and could show that dPFA is soluble within the erythrocyte cytosol. This experiment additionally excludes any blockage of the export pathway as no dPFA was associated with the pellet/PV fraction. Other proteins were still exported as normal (Figure S4), further supporting a functional export pathway. Indeed, as reported by ourselves and our colleagues (particularly from the Spielmann laboratory, Mesen-Ramirez et al 2016, Grüring et al 2012), blockage of the export pathway is likely to lead to non-viable parasites as the PTEX translocon seems to be the bottleneck for export of a number of proteins, many of which are essential for parasite survival.

      Reviewer #3 (Significance (Required)):

      The malaria-causing parasite extensively modifies the host red blood cell to convert the host into a suitable habitat for growth as well as to evade the immune response. It does so by exporting several hundred proteins into the host cell. The functions of these proteins remain mostly unknown. One parasite-driven modification, essential for immune evasion, is the assembly of 'knob' like structures on the RBC surface that display the variant antigen PfEMP1. How these knobs are assembled and regulated is unknown.

      In the current manuscript, Diehl et al target an exported parasite chaperone from the Hsp40 family, termed PFA66. The phenotypic observations described in the manuscript are quite spectacular and well characterized. The truncation of PFA66 results in some abnormal knob formation where the knobs are no longer well-spaced and uniform but instead sometimes form tubular structures termed mentulae. The mechanistic underpinnings driving the formation of mentulae remain to be understood but that will probably take several more manuscripts to decipher.

      We thank the reviewer for their kind comments, and also for the recognition that this current manuscript is merely the exciting beginning of a story!

      **Major Comments:**

      General comment on the use of controls: The large part of our study aimed to characterize differences in various characteristics upon inactivation of PFA66 function, and for this reason we used the parental WT line as a control. Using the complementation line as a control in this context would not truly reflect the effect of PFA66 truncation, as PFA66::HA was not expressed from an endogenous locus, but rather from an episomal plasmid. This itself may result in expression levels which differ from WT, and thus this parasite line cannot be seen as the gold-standard control for assaying PFA66 function. Our complementation experiments were initially designed to verify that phenotypic changes ONLY related to inactivation of PFA66 function and were (as unlikely as this is) not due to second site changes during the genetic manipulation process. To avoid lengthy and not really very informative analysis of the complementation line, we used knob morphology via SEM as a “proof-of-principle”. However, as the reviewer is formally correct, we have added a passage to the discussion stating that “We wish to note that we cannot unequivocally state that our complementation construct caused reversion of all the aberrant phenotypes herein investigated, however we feel it likely that all abnormal phenotypes are linked and thus our “proof of principle” investigation of knob/eKnob phenotypes is likely to be reflected in other facets of host cell modification and can thus be seen as a proxy for such.”

      Fig 3: The control used here is the parental line. Was there a reason why the complemented parasite line was not used as the control? Showing that the KAHRP localization and distribution is restored upon complementation would greatly increase the confidence in the phenotype.

      Please see our general comments above.

      Fig 5: The data showing a defect in CSA binding are convincing but again only the parental control is used and not the complemented parasite line. The complemented parasite line should be used as a control for the PFA binding mutant.

      Please see our general comments above, and also our response to reviewer 1.

      In 5D, the defect in dPFA seems to occur to a lesser degree than in Fig. 2C. How many biological replicates are shown in each of these figures? The figure legend says 20 cells were quantified via IFA but were these cells from one experiment? The expression of mentulae seems quite variable: while the authors mention '22%' (line 164), in most other experiments it seems to be closer to ~10% (5D and S6B, D-E). Were these experiments blinded?

      As the reviewer is likely aware, subtle differences in parasite culture conditions, stage, fixation, SEM conditions and length of time in culture between experimental time points can lead to variations in results. Due to the time required to generate the data for figure 5, these experiments took place months after the original (i.e. Figure 2C) analysis. It is not possible to directly compare the results of these two independent experiments, however it is possible to compare the results of the parasite lines included within each set of experimental data. Due to the time and cost involved, each of these experiments represents only one biological replicate. If required, we can include more replicates, although this is more likely to further complicate the situation due to the reasons mentioned above.

      Fig S6G: The staining suggests that most PfEMP1 is not exported, in any parasite line. Staining for PfEMP1 is technically challenging and these data are not enough to show that expression level is 'similar' (Line 279-280). It may be more feasible to use the anti-ATS antibody and stain for the non-variant part of PfEMP1 (Maier et al 2008, Cell).


      It is well known that a large portion of PfEMP1 remains intracellular. This figure does not aim to differentiate between surface exposed and internal PfEMP1, but merely to show that similar TOTAL PfEMP1 is expressed in the deletion line, and also that the parasites have not undergone a switching event which would lead to loss of CSA binding ability. We will endeavour to address this in future revisions by Western Blot but wish to note that WB analysis of PfEMP1 is notoriously difficult.

      Lines 320-322: The logic of why increased robustness of the RBC membrane would lead to faster parasite growth is confusing. It is likely that the loss of PfEMP1 expression leads to faster growth. The loss of NPP is minimal and may not cause growth defects in rich media.

      As far as we can detect, there is no loss of total PfEMP1 expression (as verified by figure S6G), but rather a drop in surface exposure and functionality, which is unlikely to affect parasite growth rates. What we intended to say was that the NPP assay is influenced by fragility of the erythrocyte, and therefore a stiffer erythrocyte may be more resistant to sorbitol-induced lysis. As the NPP result does not really add much to the main narrative of this manuscript, we would prefer not to invest unnecessary effort for a minimal potential readout. Indeed, we are tempted to remove the NPP data as they distract from the main findings of the manuscript, this being the reason we include them only as supplementary data.

      Lines 433-434: These data do support a function for HsHsp70 but these data are among many others that have previously provided circumstantial evidence for its role in host RBC modification. Maybe a co-IP would help support these conclusions better.

      Despite all our best efforts and publications, we have been unable to detect this interaction in co-IP or crosslink experiments, although we were successful in detecting interactions between another HSP40 (PFE55) and HsHSP70 (Zhang et al, 2017). Although this is disappointing, it may be explained by the transient nature of HSP40/HSP70 interactions. We agree that our suggestion (that parasite HSP40s functionally interact with human HSP70) is not novel (we and others have noted this possibility for over 10 years), however the challenging nature of the experimental system makes it very difficult to show direct evidence of the importance of this interaction in cellula. Over the past decade we have used numerous experimental approaches to try to address this but have always been confounded by technical challenges. In 2017 the corresponding author took a sabbatical to attempt manipulation of hemopoietic stem cells to reduce HSP70 levels in erythrocytes, however it appears (unsurprisingly) that HsHSP70 is required for stem cell differentiation, and thus this tactic was not followed further. The authors believe that, due to the lack of the necessary technology, indirect evidence for this important interaction is all that can realistically be achieved at this time, and this current study is the first to provide such evidence.

      We would further like to note that a successful co-IP would not directly verify a functional interaction between PFA66 and HsHSP70, but could also reflect a chaperone:substrate interaction between these proteins, and is therefore not necessarily informative.

      **Minor Comments:**

      Fig1: The bands are hard to see in WT and 3’Int. Maybe a higher-resolution figure would help? Also, the schematic shows primers A-D but the figure legend does not refer to them. It would be useful to the reader to have the primers indicated above the PCR gel along with the expected sizes.

      We apologise. Integration PCRs are notoriously challenging. Any revised manuscript will contain clearer images.


      Fig S1: The NPP data could be improved if tested in minimal media. It has been shown that NPP defects do not show up in rich media (Pillai et al 2012, Mol. Pharm. PMID: 22949525). Does complementation restore NPP and growth rate?

      As the NPP result does not really add much to the main narrative of this manuscript, we would prefer not to invest unnecessary effort for a minimal potential readout. Indeed, we are tempted to remove the NPP data as they deflect from the main findings of the manuscript, this being the reason we include them only as supplementary data. Likewise the complementation experiments are, we feel, unnecessary.

      Fig 4: It is not clear what the line scan analyses are supposed to show. What does ‘value’ on the y-axis mean?


      These are line scans of fluorescence intensity (arbitrary units) along the yellow arrows shown on the fluorescent panels. This is now indicated in the figure legend.

      Fig S5D: Maybe it was a problem with the file but no actin staining is visible.

      The actin stain was visible on the screen, but unfortunately not in the PDF. We have applied (suitable) enhancement to produce the images in the new version.

      Fig 6: A model for mentulae formation is not really proposed. Only what the authors expect the mentulae to look like.

      We have changed the legend to reflect this: “Figure 6. Proposed model for eKnob formation and structure.” We do propose that runaway extension of an underlying spiral protein may lead to eKnobs, and thus would like to keep the word “formation”.

      Lines 312-313: It is not clear what 'highly viable' means, parasites are either viable or not.


      This has been changed.

      Lines 400-405: The authors forgot to cite a complementary paper that showed no virulence defect upon 70x knockout or knockdown (Cobb et al mSphere 2017). Those data also support a role for HsHsp70.

      We apologise for the omission. This is now included.

      **Referee Cross-commenting**


      I agree, the comments are pretty similar. The authors could tone down their conclusions or add more data to support their conclusions. Maybe call them elongated knobs or eKnobs, instead of mentula?

      We have now removed the offending term and use eKnobs.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The data are for the most part well controlled and reveal a potential function for PFA66 in knob formation. The assays are state of the art and the data provide insight into knob formation.

      However, some conclusions are not fully supported by the data. For example, 'uncover a KAHRP-independent mechanism for correct knob biogenesis' (line 52-53) is not supported by the data because PFA66 truncation could result in misfolding of KAHRP and thus lead to knob biogenesis defects.

      The other major issue is that despite having a complemented parasite line in hand, the parental parasite line is used as a control for almost all assays. This is a critical issue because an alternative explanation for their data would be that expression of truncated PFA66 leads to expression of a misfolded protein that aggregates in the host RBC OR it clogs up the export pathway and indirectly leads to knob biogenesis defects. It is surprising that the authors do not test the localization of dPFA using microscopy especially since it is tagged with GFP. While the complemented parasite line does revert back, this could also be due to the fact that the complement overexpresses the chaperone helping mitigate issues caused by the truncated protein.

      Significance

      The malaria-causing parasite extensively modifies the host red blood cell to convert the host into a suitable habitat for growth as well as to evade the immune response. It does so by exporting several hundred proteins into the host cell. The functions of these proteins remain mostly unknown. One parasite-driven modification, essential for immune evasion, is the assembly of 'knob' like structures on the RBC surface that display the variant antigen PfEMP1. How these knobs are assembled and regulated is unknown.

      In the current manuscript, Diehl et al target an exported parasite chaperone from the Hsp40 family, termed PFA66. The phenotypic observations described in the manuscript are quite spectacular and well characterized. The truncation of PFA66 results in some abnormal knob formation where the knobs are no longer well-spaced and uniform but instead sometimes form tubular structures termed mentulae. The mechanistic underpinnings driving the formation of mentulae remain to be understood, but that will probably take several more manuscripts to decipher.

      Major Comments:

      Fig 3: The control used here is the parental line. Was there a reason why the complemented parasite line was not used as the control? Showing that the KAHRP localization and distribution is restored upon complementation would greatly increase the confidence in the phenotype.

      Fig 5: The data showing a defect in CSA binding are convincing but again only the parental control is used and not the complemented parasite line. The complemented parasite line should be used as a control for the PFA binding mutant. In 5D, the defect in dPFA seems to occur to a lesser degree than in Fig. 2C. How many biological replicates are shown in each of these figures? The figure legend says 20 cells were quantified via IFA but were these cells from one experiment? The expression of mentulae seems quite variable: while the authors mention '22%' (line 164), in most other experiments it seems to be closer to ~10% (5D and S6B, D-E). Were these experiments blinded?

      Fig S6G: The staining suggests that most PfEMP1 is not exported, in any parasite line. Staining for PfEMP1 is technically challenging and these data are not enough to show that expression level is 'similar' (Line 279-280). It may be more feasible to use the anti-ATS antibody and stain for the non-variant part of PfEMP1 (Maier et al 2008, Cell).

      Lines 320-322: The logic of why increased robustness of the RBC membrane would lead to faster parasite growth is confusing. It is likely that the loss of PfEMP1 expression leads to faster growth. The loss of NPP is minimal and may not cause growth defects in rich media.

      Lines 433-434: These data do support a function for HsHsp70 but these data are among many others that have previously provided circumstantial evidence for its role in host RBC modification. Maybe a co-IP would help support these conclusions better.

      Minor Comments:

      Fig1: The bands are hard to see in WT and 3'Int. Maybe a higher-resolution figure would help? Also, the schematic shows primers A-D but the figure legend does not refer to them. It would be useful to the reader to have the primers indicated above the PCR gel along with the expected sizes.

      Fig S1: The NPP data could be improved if tested in minimal media. It has been shown that NPP defects do not show up in rich media (Pillai et al 2012, Mol. Pharm. PMID: 22949525). Does complementation restore NPP and growth rate?

      Fig 4: It is not clear what the line scan analyses are supposed to show. What does 'value' on the y-axis mean?

      Fig S5D: Maybe it was a problem with the file but no actin staining is visible.

      Fig 6: A model for mentulae formation is not really proposed. Only what the authors expect the mentulae to look like.

      Lines 312-313: It is not clear what 'highly viable' means, parasites are either viable or not.

      Lines 400-405: The authors forgot to cite a complementary paper that showed no virulence defect upon 70x knockout or knockdown (Cobb et al mSphere 2017). Those data also support a role for HsHsp70.

      Referee Cross-commenting

      I agree, the comments are pretty similar. The authors could tone down their conclusions or add more data to support their conclusions. Maybe call them elongated knobs or eKnobs, instead of mentula?

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Plasmodium falciparum exports several proteins that contain J-domains and are hypothesized to act as co-chaperones to support partner HSP70 chaperones in the host erythrocyte, but the function of these co-chaperones is largely unknown. Here the authors provide a functional analysis of one of these exported HSP40 proteins known as PFA66 by using the selection-linked integration approach to generate a truncation mutant lacking the C-terminal substrate binding domain. While there is no fitness cost during in vitro culture, light and electron microscopy analysis of this mutant reveals defects in knob formation that produce a novel, extended knob morphology and ablate Var2CSA-mediated cytoadherence. These knob formation defects are distinct from previous mutants and this unique phenotype is exploited by the authors to show that the HSP70-stimulating "HPD" motif of PFA66 impacts rescue of the altered knob phenotype. In other HSP40 co-chaperones, this motif is critical to stimulate partner HSP70 activity, suggesting that PFA66 acts as a bona fide co-chaperone. Importantly, previous work by the Przyborski lab and others has shown that deletion of PfHSP70x, the only HSP70 exported by the parasite, does not phenocopy the PFA66 mutant, implying that the partner HSP70 is of host origin. The results are exciting but I have some concerns about controls needed to properly interpret the functional complementation experiments. My specific comments are below.

      Major comments

      • The failure of the HPD mutant PFA66 to rescue the knob-defect is very interesting. However, the authors need to determine that the HPD mutant is expressed at the same level as the WT (by quantification against the loading controls in the western blots in Fig 1D and Fig S6H) and is properly exported (by IFA and/or WB on fractionated iRBCs, as done for the GFP-fused truncation in Fig S1A). Otherwise, the failure to rescue is hard to interpret. If these controls were in place, the conclusion that a host HSP70 is likely being hijacked by PFA66 would be appropriate. These genetic data would be greatly strengthened by in vitro experiments with recombinant protein showing activation of a host HSP70 by PFA66, but I realize this may be out of the scope of the present study. Along these lines, it might be worth discussing the finding by Daniyan et al 2016 that recombinant PFA66 was found to bind human HSPA1A with similar affinity to PfHSP70x but did not substantially stimulate its ATPase activity, suggesting this is not the relevant host HSP70. This study is cited but the details are not discussed.
      • The authors claim that truncation of PFA66 alters the localization of KAHRP but not the other exported proteins they evaluated by IFA (Fig S4). This seems baseless as they don't apply the same ImageJ evaluation to these other proteins. Similarly, the statement that KAHRP structures "appear by eye to have a lower circularity, although we were not able to substantiate this with image analysis" is subjective/qualitative and should probably be removed.
      • The section title "Chelation of membrane cholesterol...causes reversion of the mutant phenotype in ∆PFA" seems an overstatement given the MBCD effect on the knob morphology is fairly weak and remains significantly abnormal.

      Minor comments

      • The DNA agarose gel image in Fig 1B is not very convincing. Most of the bands are faint and there is a lot of background/smear signal in the lanes. Also, it would help for clarity if the primer pairs used for each reaction were stated as shown in the diagram (rather than simply "WT", "5' Int" and "3' Int").
      • Given the vulgar connotation of "mentula", the authors might consider an alternative term.
      • lines 67-69: The authors may wish to cite a more recent review that takes into account the updated Plasmepsin 5 substrate prediction from Boddey et al 2013 (PMID: 23387285). For example, Boddey and Cowman 2013 (PMID: 23808341) or de Koning-Ward et al 2016 (PMID: 27374802).
      • lines 77-79: "deleted" is repetitive in this sentence.
      • line 115: It might be clearer to state "endogenous PFA66 promoter"
      • lines 131-132: "...these data suggests that deletion of the SBD of PFA66 leads to a non-functional protein." Behl et al 2019 (PMID: 30804381) showed the recombinant C-terminal region of PFA66 (residues 219-386, including the SBD truncated in the present study) binds cholesterol. The authors may wish to mention this along with their reference to Kulzer et al 2010 showing PFA66 segregates with the membrane fraction, suggesting cholesterol is involved in J-dot targeting.
      • line 198: It's not clear what is meant by "+ve" here and afterward. Please define.
      • lines 749-750: "Production of PFA and NEO as separate proteins is ensured with a SKIP peptide". Translation of the 2A peptide does not always cause a skip (see PMID: 24160265) and often yields only about 50% skipped product (for example, PMID: 31164473). Because of the close cropping in the western blots in Fig 1C or S1A this is difficult to assess. Is a larger unskipped product also visible? Beyond this one point, it is generally preferable that the blots not be cropped so close.
      • lines 867-868: Explain more clearly what "Cy3-caused fluorescence" is measuring.
      • Several figure legends would benefit from a title sentence describing what the figure is about (ie, Fig legends 1, 3, 5, S1, S5 & S6)

      Significance

      This manuscript by Diehl et al reports on the function of the exported P. falciparum J-domain protein PFA66 in remodeling the infected RBC. Obligate intracellular malaria parasites export effector proteins to subvert the host erythrocyte for their survival. This process results in major renovations to the erythrocyte, including alteration of the host cell cytoskeleton and formation of raised protuberances on the host membrane known as knobs. Knobs serve as platforms for presentation of the variant surface antigen PfEMP1, enabling cytoadherence of the infected RBC to the host vascular endothelium. This process is of great interest as it is critical for parasite survival and severe disease during in vivo infection. The basis for trafficking of exported effectors within the erythrocyte after they are translocated across the vacuolar membrane is not well understood but is known to involve chaperones. This is a particularly interesting study in that it provides evidence in support of the hypothesis, initially proposed nearly 20 years ago, that the parasite hijacks host chaperones to remodel the erythrocyte. This is biologically intriguing and also suggests new therapeutic strategies targeting host factors that would not be subjected to escape mutations in the parasite genome. The work will be of interest to those studying exported protein trafficking and/or virulence in Plasmodium (such as this reviewer) as well as the broader chaperone and host-pathogen interaction fields.

      Referee Cross-commenting

      I also agree that the comments are similar. Some additional discussion on the failure to localize the PFA66 truncation by live FL is warranted, as noted by reviewer #1. It seems likely that either the level of PFA66 protein is reduced by the truncation or the truncated PFA66 is dispersed from J-dots and harder to visualize when diffuse instead of punctate. In either case, the complementing copy (WT or QPD) should be visualized by IFA.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Review of "Co-chaperone involvement in knob biogenesis implicates host-derived chaperones in malaria virulence." by Diehl et al for Review Commons.

      Major Comments.

      1. In this paper the function of Plasmodium falciparum exported protein PFA66 is investigated by replacing its functionally important dnaJ region with GFP. These modified parasites grew fine but produced elongated knob-like structures, called mentulae, at the surface of the parasite-infected RBCs. Knobs are elevated platforms formed by exported parasite proteins at the surface of the infected RBC that are used to display PfEMP1 cytoadherence proteins which help the parasites avoid host immunity. The mentulae still display some PfEMP1 and contain exported proteins such as KAHRP but can no longer facilitate cytoadherence. Complementation of the truncated PFA66 with full length protein restored normal knob morphology; however, complementation with a non-functional HPD to QPD mutant did not restore normal morphology, implying that interaction of PFA66 with an HSP70, possibly of host origin, is important for function. While a circumstantial case is made for PFA66 interacting with human HSP70 rather than parasite HSP70-x, is there any direct evidence for this, e.g. protein binding evidence? I feel that without some additional evidence for a direct interaction between PFA66 and human HSP70 the paper's title is a little misleading.
      2. Was CSA binding restored upon complementation of ∆PFA with the full-size copy of PFA66?

      Minor Comments

      1. Line 36, NPP should be NPPs if referring to the plural.
      2. Line 37, MC should be MCs if referring to the plural. By the way this acronym is never used in the text, it's always written 'Maurer's clefts'.
      3. Abstract, Line 52-53, could be changed to "uncover a new KAHRP-independent..." as it currently implies (albeit weakly) that this is the first observation of a KAHRP-independent mechanism for correct knob biogenesis. Maier et al 2008 have previously shown that knock out of PF3D7_1039100 (J-domain exported protein) greatly reduced knob size and knock out of PHISTb protein PF3D7_0424600 resulted in knobless parasites.
      4. In the Abstract it is mentioned that "Our observations open up exciting new avenues for the development of new anti-malarials." This is never really expanded upon in the rest of the paper and so seems like a bit of a throwaway line and could be left out.
      5. Line 59, WHO world malaria report should be cited here since these numbers are from the report not a paper from 2002.
      6. Line 67, Marti et al 2004 should be cited here as it was published at the same time as Hiller et al 2004.
      7. Line 76, I suggest using either 'erythrocyte' or 'red blood cell' throughout the text not both.
      8. Line 80, Maier et al 2008 should be referenced here.
      9. Line 87, the authors should cite Birnbaum et al 2017 for the technique used. This is cited immediately after (line 98) in the results section but could be addressed at both points in the text.
      10. Line 123, IFAs and live cell imaging failed to detect the PFA-GFP protein and the author proposes this is due to low expression levels. However, PFA66 is expressed at ~350 FPKM in the ring stage and previous studies from your own group have visualised it using GFP before. Is there another explanation for this such as disruption of the locus here has served to greatly reduce the expression level of the fusion protein?
      11. Line 147, for consistency it would be best to introduce infected red blood cell (iRBC) at the beginning of the main text and use throughout the text instead of switching between 'infected human erythrocyte' and iRBC.
      12. Line 153, Fig S2A does not exist.
      13. Lines 156-158: Different knob morphologies are described with repeated reference to Fig2 and FigS2. Since multiple whole-cell SEM images are displayed in these figures it would be worth adding lettering and/or zoomed-in regions of interest highlighting examples of each aberrant knob type.
      14. Line 178-179, "Although not highly abundant in either sample, the morphology of Maurer's clefts appeared comparable in both samples (data not shown)." Why is the data not shown? Representative images of Maurer's clefts from each line should be included in the supplementary figures or this in-text statement should be more clearly justified.
      15. Line 196, indirect immunofluorescence assay (IFA).
      16. Line 201, how was the 'non-significant difference' measured? PHISTc looks quite different by eye. Rephrase the term "significant difference" as localisation of these exported proteins was compared visually rather than quantified. Otherwise, a measure of mean fluorescence intensity could be taken for each protein as a basic comparison between the two lines. In the Figure legend of S4, the term "no drastic difference", is used suggesting this was not quantified. By the way, PHISTc appears different by the represented figure.
      17. Line 213, you now have 3 versions for the word wild type, 'wild type', 'wild-type' and 'WT', best to choose one for consistency.
      18. Line 232, 'tubelike' to 'tube-like'.
      19. Line 279, just use 'IFA', the acronym has already been explained earlier in the text.
      20. Line 319, 'permeation' should be 'permeability'.
      21. Line 353, 'The action of host actin is known' to 'Host actin is known'.
      22. Line 373, 'through their role as regulators'.
      23. Line 402, either use 'HSP70-x' or 'HSP70-X' throughout the text.
      24. Line 540, the speed used to pellet the samples for sorbitol lysis assay, 1600g is quite high and could reflect RBC fragility rather than direct sorbitol induced lysis. The parasitemia is also very low, and previous published methods have used ~90% parasitemia rather than the 2% used here. We are not saying the method is wrong but please check it is accurate.
      25. Line 479, 10µm should be 10 µM.
      26. In Fig 1A, the primers A, B, C etc are not explained anywhere that I can see.
      27. Figure 1B, I do not see any clear band for the 3' integration indicated with the *. Can a better image be shown?
      28. It seems from Fig 3G,H,I that the KAHRP puncta are bigger in ∆PFA but are as abundant as CS2. Given that KAHRP is associated with knobs how do you reconcile this with there being fewer knobs per unit area in ∆PFA compared to CS2 as in Fig 2B? The numbers of knobs/KAHRP spots/Objects per um2 seems to vary between Fig 2 and 3. Please provide some commentary about this.
      29. In the bottom panels of Fig 4, KAHRP::mCherry appears to extend beyond the glycocalyx beyond the cell. Is this an artifact?
      30. Line 837, does this refer to 10 technical replicates or was the experiment repeated on 10 independent occasions? This should at least be done in 2 biological replicates given the range in technical replicates on the graph. Was CS2 considered as '100% lysis' or the water control described in the method? Please provide more detail.

      Significance

      This is a reasonably significant publication as it describes knob defects that to my knowledge have never been observed before. Importantly, the deletion of the J domain from PFA66 is genetically complemented to restore function really confirming a role for this protein in knob development. Amino acids critical for the function of the J-domain are also resolved. Apart from some minor technical and wording issues the paper is really nice work apart from one area which is the proposed partnership of PFA66 with human HSP70 for which there is not much direct evidence. If this evidence can be provided, we think this work could be published in a high impact journal. Without the evidence, it could find a home in a mid-level journal with some tempering of the claims of PFA66's interaction with human HSP70.

      Referee Cross-commenting

      There seems to be a high degree of similarity in the reviewers' comments and I think as many issues as possible should be addressed. I definitely agree that the term mentula should not be used.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Bhide and colleagues present an insightful study of how cellular mechanics influences differential cell behaviour during morphogenesis despite apparent genetic homogeneity of the cellular ensembles. They dissect the extensively studied system of mesoderm invagination in Drosophila, focussing on the differences in cell behaviours between the cells in the middle of the infolding tissue and on the periphery that, as far as we know, share a common gene expression profile. They describe the sub-cellular dynamics of a major effector of apical constriction morphogenesis, the myosin motor distribution, in the invaginating cells and conclude that differences in myosin levels alone cannot account for the observed differences in cell behaviours. In order to understand the cell behaviour inhomogeneity, they turn to biophysical simulation and in an impressively exhaustive manner substantiate the idea that non-linear effects are required for explaining the phenomenon. This theoretical treatment fits well with the notion that not the genetic identity of the cells but rather cell-cell mechanical coupling determines the differences in the invaginating cells' behaviours. Additionally, the modelling is consistent with the myosin asymmetry and dynamics in the cells whose behaviours are being contrasted. Complementary, and beautifully executed, filament-based modelling of microscopic actomyosin contractility further corroborates this view. Finally, the proposed model of non-linear actomyosin contractility dynamics governing the differential cell behaviour across a genetically homogeneous cellular field is challenged by two complementary experimental approaches, laser ablation and optogenetics.

      Overall, the results represent convincing evidence that points the tissue mechanics field of Drosophila mesoderm into an interesting new direction and has general implications for the understanding of the interplay between genetic regulation and emergent behaviours of cells operating in a mechanically complex multicellular embryonic context.

      The study is meticulously executed, highly quantitative and combines effectively experiment and theory. I have only minor comments that concern in particular the presentation of the results.

      The paper is very dense and the text does not complement well the results presented in the main figures. Many panels in the Figures are not referred to explicitly. Figure elements are referenced out of order both within and across Figures. Sometimes, particularly, in the last two Figures (3 and 4) the reader is left alone to figure out what the data show (with the appropriately terse legends and without the clear narrative in the text, it is an uphill battle for non-specialists). Some key results are hidden in the sea of elements within the Figure 2 that contains the most important, relevant and impressive data. As an example, on line 168 the authors point to panel 2F to demonstrate the asymmetry of myosin distribution in some cells. To the best of my understanding, this phenomenon is actually shown in Fig 2E which is curiously not referenced at all.

      Similarly, Figure 2K and L provide crucial data substantiating much of the conclusions of the paper. It requires a major effort to understand what the graphs mean.

      The following simulation results are quite impressive and would deserve a separate Figure which could provide more space for explaining what the parameter maps actually show. What is for instance plotted on the Y axis as steepness?

      Secondly, I find the overall narrative of the manuscript needing some reorganisation. The main question is set-up extremely well, however in the middle of the manuscript the focus on the connection between cell behaviours and genetic programs is lost. New conclusions on force transmission between cells emerge, however they are not obviously connected with the question posed from the onset and addressed in the discussion section. My impression is that the authors are conservative in their reasoning, however it does compromise the overall message of the story that should ideally focus on one subject. I find the combined evidence presented sufficiently supportive of the model that is beautifully and eloquently presented in the concluding sentence of the paper:

      "This mechanism, which we propose corresponds to the non-linear behaviour predicted by the models, would apply both to central and to lateral cells, with a catastrophic 'flip' being stochastic and rare in central cells, but reproducible in lateral cells because of the temporal and spatial gradient in which contractions occur."

      This may not turn out to be the entire story or even entirely correct, but it is certainly an exciting way of thinking about the problem. I wish that the manuscript would stay more on this subject throughout and provide intermediate conclusions supporting this model as the story develops.

      Few more minor comments:

      Line 36 - typo
      Line 97 - starting bracket missing
      Line 126 - data on intensity are presented here. There is also a panel on concentration (Fig 1H). Where is this discussed?
      Line 132 - panel 2G - disruptive out of sequence reference to a future figure
      Line 135 - with this regard - please spell out this important conclusion
      Line 183 - typo
      Line 210 - insects do not have intermediate filaments
      Line 238 - please provide a hint of how such global ablations are performed
      Line 240 - walk us through the Figure, it is too complex to figure it out alone
      Line 245 - why is the clear hypothesis mentioned above (point 2) rephrased?
      Line 273 - vague statement

      Significance

      The results represent convincing evidence that points the tissue mechanics field of Drosophila mesoderm into an interesting new direction and has general implications for the understanding of the interplay between genetic regulation and emergent behaviours of cells operating in a mechanically complex multicellular embryonic context.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Bhide and colleagues explore the mechanisms of cell expansion in epithelial morphogenesis. During the invagination of the Drosophila mesoderm, cells in the center of the prospective mesoderm constrict under the action of actomyosin pulses, while lateral cells elongate towards the center of the mesodermal placode to accommodate the reduction in apical surface of the central cells. Central and lateral cells display strong similarities in terms of gene expression. How, then, are these different behaviors (contraction and expansion) accomplished? The authors found that both central and lateral cells assemble actomyosin networks, although lateral cells do it with a certain delay. Mathematical models of cell constriction across the mesoderm using different strain-stress responses showed that strain-induced cell softening was necessary to recapitulate the patterns of constriction and expansion observed in vivo. Furthermore, modelling predicts that cells can stretch until the actin networks yield and break. Laser ablation and optogenetic reduction of contractility in central cells result in a reduction in the apical surface area of lateral cells. An optogenetic increase in contractility in lateral cells caused an increase in apical area in central cells. Together, these data suggest that mechanical cues can override and contribute to sculpting genetically defined morphogenetic domains.

      I propose to address the following points before further considering the manuscript:

      MAJOR

      1. Figure 3: following laser ablation of central cells, lateral cells reduce their apical surface. How do the authors know that this reduction in lateral cell apical surface area is an active process, driven by actomyosin-based contraction, rather than a passive response to the expansion of the wound induced by laser ablation? A similar argument could explain the constriction of lateral cells after optogenetic inhibition of actomyosin networks: the central cells relax, expand and compress the lateral cells. To demonstrate active responses of the lateral cells upon laser ablation and optogenetic manipulations of central cells, at the very least the authors should show the distribution of myosin in the lateral cells that constrict and demonstrate the assembly of contractile networks.
      2. Modelling suggests that actin networks yield and break in lateral cells. Does this occur in vivo?
      3. Lines 166-175: The authors propose that constriction of a cell affects the localization of myosin in its neighbors. However, this is not directly measured. The authors should quantify the relative myosin offset in the cells around constricting cells, and show that that offset is greater (and oriented towards the constricting cell) than in cells around expanding cells. There should be a correlation between the relative size change of a cell and the myosin offset (not just concentration) in their neighbours. In addition, does optogenetic activation of constriction in lateral cells affect the offset of myosin networks in central cells?
      4. Fig. 2E-F: the authors argue that the mean myosin concentration in lateral cells at certain times is equivalent to that of central cells earlier in the invagination process. However, the fraction of apical surface area covered by myosin network is consistently lower for lateral cells (and also for central cells that remain unconstricted!). Have the authors considered this fact, and if not, why? Wouldn't this explain, at least in part, why some cells constrict and others do not, if medial myosin networks drive the disassembly of the apical surface? If myosin activity were increased in lateral cells once central cells begin constricting, would that lead to an increased fraction of lateral cell surfaces covered by actomyosin networks and to reduced lateral cell elongation?

      MINOR

      1. Image panels are missing scale bars in many figures.
      2. Fig. 1C'-D': The authors should include a color bar to provide some indication of the scale of the apical areas measured. Same comment for other figures in which apical area is color-coded.
      3. Supp. Fig. 2E-F, G-H and Supp. Fig. 6: what is the difference between myosin intensity and myosin concentration? Junctional vs medial localization? Or summed vs mean pixel value? Please be specific, the difference between intensity and concentration is not clear.
      4. Line 118: Supp. Fig. 2 does not have panels I and K.
      5. Line 223: the authors reference data at 175 sec, but Supp. Fig. 6 does not show any images at that time point. They should be added or a different time point indicated.

      TYPOS

      1. Abstract: "[in a supracellular context" should be "in a supracellular context".
      2. Line 145: should this be a reference to Supp. Fig. 5 instead of Supp. Fig. 4?
      3. Line 166: I am not sure how Supp. Fig. 5 supports this statement. Is this the right figure reference? Should it be Supp. Fig. 4 instead?
      4. Line 881: "representing on line" should be "representing one line".

      OPTIONAL

      Tony Harris' lab showed that the Arf-GEF Steppke antagonizes myosin and facilitates cell deformation at the leading edge of the embryonic epidermis during Drosophila dorsal closure (West et al., Curr Biol, 2017). Does Steppke localize to junctions in lateral but not central mesoderm cells? Does the pattern of Steppke localization in the mesoderm change with manipulations to the contractility of central cells?

      Significance

      This is an interesting study, and one that makes use of beautiful tools, including quantitative microscopy and image analysis, mathematical modeling and optogenetic manipulations. The prediction that embryonic cells display non-linear stress-strain responses is exciting, as linearity has been the predominant assumption so far. However, I find that model predictions are not well supported by the data, and that alternative interpretations of some results are possible. Additionally, the paper lacks insight into the molecular mechanisms that facilitate stretching (although that could be the subject of a follow-up study).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors explore potential mechanisms for why some cells constrict while other cells expand, despite similar intrinsic genetic programs, during Drosophila ventral furrow formation at the onset of gastrulation. The authors combine quantitative analyses of cell shapes and myosin levels from multiphoton confocal and Multi-View SPIM imaging, optogenetic and laser perturbation experiments, and mechanical models to argue that nonlinear mechanical interactions between cells are required to explain the cell behaviors. Based on microscopic models of the actomyosin cytoskeleton in the tissue, the authors argue that the required nonlinear mechanical behavior is consistent with actomyosin network reorganization.

      Major comments:

      • Although the area of investigation is exciting and the results are interesting, unfortunately the quality of the results and comparison between experiment and modeling in the current version of the manuscript are not convincing. Although it is not clearly explained in the manuscript, the experimental results on cell shapes, myosin intensity, laser manipulation, optogenetic perturbations appear to be from a single embryo or small number of embryos for each experiment (Figures 1, 3, 4). The authors state that the cell stretching pattern "was best recapitulated by a superelastic response", but did not provide direct quantitative comparisons of the different mechanical models to the experimental data to clearly demonstrate this. Moreover, the local optogenetic myosin recruitment experiments in Figure 4 do not provide sufficient information on optogenetic tool recruitment, myosin localization, or cell behaviors to justify the claim that the central cells are not activated by the optogenetic perturbation and are only responding to the forces from neighboring cells.
      • The authors should provide direct quantitative comparisons of the models and experiments to clearly demonstrate their claims that the superelastic model is better than the linear model or other nonlinear models.
      • The authors should do additional experiments and/or provide more details for the existing experiments (to include several embryos per condition) on myosin quantification, photo-manipulation, and optogenetics experiments. Additional controls would likely be necessary for claims resulting from the optogenetics experiments in Figure 4.
      • The additional time and resources required to address these concerns would depend on the experimental details, N values, and statistics in the current studies, which unfortunately were not described in the current manuscript.
      • Methods descriptions for reproducibility are generally adequate, with the exception of N values and statistics - see above.
      • Are the experiments adequately replicated and statistical analysis adequate? No, see above.
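
      The quantitative model comparison requested in the points above could, for instance, be reported as a per-model error score against the measured profile. The following is only a minimal sketch under stated assumptions: the arrays are made-up illustrative numbers, not data from the manuscript, and `rmse` and `rank_models` are hypothetical helper names; the authors may prefer a different goodness-of-fit measure.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def rank_models(observed, predictions):
    """Return (model_name, rmse) pairs sorted from best to worst fit."""
    scores = {name: rmse(observed, pred) for name, pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical relative apical areas per cell row (illustrative values only).
observed = [0.4, 0.5, 0.7, 1.0, 1.4, 1.6]
predictions = {
    "linear": [0.8, 0.85, 0.9, 1.0, 1.1, 1.15],
    "superelastic": [0.45, 0.5, 0.75, 1.0, 1.35, 1.55],
}
ranking = rank_models(observed, predictions)
for name, score in ranking:
    print(f"{name}: RMSE = {score:.3f}")
```

      A score of this kind, computed per embryo, would also make the N values and the embryo-to-embryo variability explicit.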

      Minor comments:

      1) Scale bars for images are missing throughout.

      2) Number of embryos and cells analyzed missing throughout text and figure legends.

      3) Units are missing for many quantities in figures and tables throughout.

      4) Many figure references in the main text are incorrect, pointing either to the wrong figure or wrong figure panel.

      5) Line 728. What time point was used for myosin concentrations used in the model? How might myosin dynamics influence these findings?

      6) The authors show a few examples of myosin pulsing in lateral cells and then conclude that myosin pulsing is not qualitatively different from central cells (lines 135-136). The authors should quantify the number of pulsing lateral cells as well as period and amplitude of pulsing, or discuss relevant results from prior studies in more detail to justify this conclusion.

      7) Lines 145-150. The authors very briefly describe the results of the linear-stress strain response and conclude this did not yield outputs corresponding to in vivo data and leave this largely to the supplementary figures. This is a key point in the paper and deserves much more discussion and space in the main text. As mentioned in main comments above, a quantitative comparison of the different mechanical models to show that the superelastic model better describes the observations should be included (potentially as an inset to Fig 2D showing a quantitative measure of the quality of model fit to the data).

      8) Lines 162-163. Provide more rationale for why strain-softening would most likely manifest as permanent or reversible cytoskeletal reorganization.

      9) Lines 187-188. "This shows that forces acting on each cell from its neighbors have an important role in determining the cell's behavior." This seems somewhat obvious; perhaps a bit more explanation would help the reader to understand the importance of these results.

      10) Lines 196-198. How were the concentrations and lengths of F-actin chosen? How were the concentration and properties of linkers chosen? How sensitive are the results to these details of the cytoskeletal composition?

      11) Lines 238-244. It would be helpful to include some additional quantification that clearly shows the reader the differences in cell behaviors in control and perturbed tissue. For the optogenetics experiment, it would be important to show quantification that the lateral cells are not being directly perturbed during photoactivation of neighboring cells (e.g. due to light leakage). In both perturbations, it would be helpful to quantify how many cells in rows 7 and 8 constricted and by how much did they constrict? How reproducible were these effects?

      12) Lines 245-252. A key assumption in interpreting this experiment seems to be that the central cells are not directly perturbed by the optogenetic activation. Additional quantifications of RhoGEF2-CRY2 and/or myosin should be shown to support this. It would be helpful to include some additional quantification that clearly shows the reader the differences in cell behaviors in control and experimental regions. How reproducible were these effects?

      13) A section on statistics is missing from the methods section.

      14) Line 615. Ensure that Eq. 1 is dimensionally consistent; crucially, what units are used for 'M'? If the model is non-dimensionalized, provide the reference scales.

      15) Line 675: The investigated stress-strain relationships are presented in Table S1. What are the definitions of xpl and xsh?

      16) Line 678: Parameter values for the stress-strain relationships are given in Table S2. Can you provide more information on how these values were selected and their units? How sensitive are the results to changes in these values? Provide references when possible.

      17) Line 697. Please comment on why the embryo appears skewed to the right.

      18) Line 712. A color-bar corresponding to this color-code is missing in the figure.

      19) Lines 715-717. It seems panels E and E' are swapped in the legend.

      20) Line 724 (Fig 2). It is difficult to read anything in panel K inset or Panel L inset.

      21) Line 728. What does "embryo 1" refer to?

      22) Line 732. A quantitative measure of the quality of the fits of the models to the experimental data should be included.

      23) Line 739. What exactly does "Embryo 2" refer to?

      24) Line 779. Why is a z-plane of 15 microns below surface chosen?

      25) Line 797. Why is a z-plane of 25 microns below the surface chosen?

      26) Line 900. Panel G in Supp Fig 5 is not described in figure description.

      • Are prior studies referenced appropriately? Yes.
      • Are the text and figures clear and accurate? No (see details listed above).
      • It would be very helpful to the reader to show direct quantitative comparison of the different mechanical models with the experimental observations to show how much better the nonlinear model is compared to the linear model. An extended explanation of experiments and experimental results within the main text would improve the manuscript.

      Significance

      The key advance in this work is in identifying a potential role of nonlinear mechanical properties in contributing to distinct cell behaviors within a tissue during development in vivo. This contributes to a growing body of work highlighting the importance of cell and tissue mechanical properties in regulating cell behaviors during the formation of tissue structure.

      This work adds to a growing body of work connecting actomyosin contractility in cells to tissue-scale behavior during development. This work provides a unique mechanical modeling perspective to the study of apical constriction during Drosophila ventral furrow invagination, highlighting a potential role for superelastic cell mechanical behaviors during morphogenesis in vivo.

      The finding would be of interest to researchers working in the areas of morphogenesis, mechanobiology, the cytoskeleton, and active matter.

      This reviewer's expertise is in experimental studies of the cytoskeleton and cell mechanics during morphogenesis.

    4. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Bhide and colleagues present an insightful study of how cellular mechanics influences differential cell behaviour during morphogenesis despite apparent genetic homogeneity of the cellular ensembles. They dissect the extensively studied system of mesoderm invagination in Drosophila, focussing on the differences in cell behaviours between the cells in the middle of the infolding tissue and on the periphery that, as far as we know, share a common gene expression profile. They describe sub-cellular dynamics of the major effector of apical constriction morphogenesis, the myosin motor distribution, in the invaginating cells and conclude that differences in myosin levels alone cannot account for the observed differences in cell behaviours. In order to understand the cell behaviour inhomogeneity, they turn to biophysical simulation and in an impressively exhaustive manner substantiate the idea that non-linear effects are required for explaining the phenomenon. This theoretical treatment fits well with the notion that not the genetic identity of the cells but rather cell-cell mechanical coupling determines the differences in the invaginating cells' behaviours. Additionally, the modelling is consistent with the myosin asymmetry and dynamics in the cells whose behaviours are being contrasted. Complementary, and beautifully executed, filament-based modelling of microscopic actomyosin contractility further corroborates this view. Finally, the proposed model of non-linear actomyosin contractility dynamics governing the differential cell behaviour across a genetically homogeneous cellular field is challenged by two complementary laser ablation and optogenetic experimental approaches.

      Overall, the results represent convincing evidence that points the tissue mechanics field of Drosophila mesoderm into an interesting new direction and has general implications for the understanding of the interplay between genetic regulation and emergent behaviours of cells operating in mechanically complex multicellular embryonic context. The study is meticulously executed, highly quantitative and combines effectively experiment and theory. I have only minor comments that concern in particular the presentation of the results.

      The paper is very dense and the text does not complement well the results presented in the main figures. Many panels in the Figures are not referred to explicitly. Figure elements are referenced out of order both within and across Figures. Sometimes, particularly, in the last two Figures (3 and 4) the reader is left alone to figure out what the data show (with the appropriately terse legends and without the clear narrative in the text, it is an uphill battle for non-specialists). Some key results are hidden in the sea of elements within the Figure 2 that contains the most important, relevant and impressive data.

      We have split this figure in two, moved some of the results from Suppl. Fig. 5 into one of its parts and included new calculations and data. We have also extended the description of these results in the main text and in the figure legends.

      As an example, on line the authors point to panel 2F to demonstrate the asymmetry of myosin distribution in some cells. To the best of my understanding, this phenomenon is actually shown in Fig 2E which is curiously not referenced at all.

      We have corrected the references to the panels

      Similarly, Figure 2K and L provide crucial data substantiating much of the conclusions of the paper. It requires a major effort to understand what the graphs mean. The following simulation results are quite impressive and would deserve a separate Figure which could provide more space for explaining what the parameter maps actually show. What is for instance plotted on the Y axis as steepness?

      We have added the following explanation: “The ‘width’ of the profile is the number of cells with maximum value; the ‘steepness’ is the slope between minimal and maximal values (equation 2 in materials and methods).”
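
      To make the two profile descriptors concrete, here is a minimal sketch of how they could be computed from a per-cell profile. Equation 2 of the materials and methods is not reproduced in this reply, so the slope convention used here (value difference divided by the number of cell positions between the first maximum and the first minimum) is an assumption, as are the function names and the example values.

```python
def profile_width(values, tol=1e-9):
    """Number of cells whose value equals the profile maximum (within tol)."""
    peak = max(values)
    return sum(1 for v in values if abs(v - peak) <= tol)

def profile_steepness(values):
    """Slope between the first maximum and the first minimum of the profile.

    Assumed convention: (max - min) divided by the number of cell
    positions separating the two; a flat profile has steepness 0.
    """
    i_max = values.index(max(values))
    i_min = values.index(min(values))
    if i_max == i_min:
        return 0.0
    return (values[i_max] - values[i_min]) / abs(i_max - i_min)

# Hypothetical profile across five cells, from the midline outwards.
profile = [1.0, 1.0, 1.0, 0.6, 0.2]
width = profile_width(profile)          # three cells sit at the maximum
steepness = profile_steepness(profile)  # drop of 0.8 over four positions
print(width, round(steepness, 3))
```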

      Secondly, I find the overall narrative of the manuscript needing some reorganisation. The main question is set-up extremely well, however in the middle of the manuscript the focus on the connection between cell behaviours and genetic programs is lost. New conclusions on force transmission between cells emerge, however they are not obviously connected with the question posed from the onset and addressed in the discussion section.

      To us, the section on force transmission seemed like an important component of the issue of intrinsic versus extrinsically determined cell behaviours. We had seen that the intrinsic programme of the cells, as reflected in their myosin levels, might not be sufficient to explain the difference between stretching and constricting. If their behaviour is not intrinsically determined, then there must be something acting from the outside, and we are looking here at what that might be, i.e. we need to find out how the potential constriction is influenced. The first model tests under which conditions differential contractility leads to different ‘cell’ behaviours. This in turn leads directly to the question of the forces the cells in the epithelium exert on each other.

      My impression is that the authors are conservative in their reasoning, however it does compromise the overall message of the story that should ideally focus on one subject. I find the combined evidence presented sufficiently supportive of the model that is beautifully and eloquently presented in the concluding sentence of the paper:

      "This mechanism, which we propose corresponds to the non-linear behaviour predicted by the models, would apply both to central and to lateral cells, with a catastrophic 'flip' being stochastic and rare in central cells, but reproducible in lateral cells because of the temporal and spatial gradient in which contractions occur."

      This may not turn out to be the entire story or even entirely correct, but it is certainly an exciting way of thinking about the problem. I wish that the manuscript would stay more on this subject throughout and provide intermediate conclusions supporting this model as the story develops.

      Few more minor comments:

      We have corrected all of the typos, mistakes and omissions and adapted the text, as mentioned below.

      Line 36 - typo

      Line 97 - starting bracket missing

      Line 126 - data on intensity are presented here. There is also a panel on concentration (Fig 1H). Where is this discussed?

      An explanation (definition) has been added to the main text.

      Line 132 - panel 2G - disruptive out of sequence reference to a future figure

      Line 135 - with this regard - please spell out this important conclusion

      We have expanded this part, basically introducing the conclusion more clearly (we hope).

      Line 183 - typo

      Line 210 - insects do not have intermediate filaments

      We have added ‘mammalian’ to the reported experiment in the text, to make it clear that this does not refer to Drosophila cells.

      Line 238 - please provide a hint of how such global ablations are performed

      We have added this – both explicitly, and the relevant references.

      Line 240 - walk us through the Figure, it is too complex to figure it out alone

      We have added a more extensive explanation both in the text and in the new figure legend.

      Line 245 - why is the clear hypothesis mentioned above (point 2) rephrased?

      Line 273 - vague statement

      We have changed the text in response to these useful pointers.

      Significance:

      The results represent convincing evidence that points the tissue mechanics field of Drosophila mesoderm into an interesting new direction and has general implications for the understanding of the interplay between genetic regulation and emergent behaviours of cells operating in mechanically complex multicellular embryonic context.

      Reviewer #2

      Bhide and colleagues explore the mechanisms of cell expansion in epithelial morphogenesis. During the invagination of the Drosophila mesoderm, cells in the center of the prospective mesoderm constrict under the action of actomyosin pulses, while lateral cells elongate towards the center of the mesodermal placode to accommodate the reduction in apical surface of the central cells. Central and lateral cells display strong similarities in terms of gene expression. How, then, are these different behaviors (contraction and expansion) accomplished? The authors found that both central and lateral cells assemble actomyosin networks, although lateral cells do it with a certain delay. Mathematical models of cell constriction across the mesoderm using different strain-stress responses showed that strain-induced cell softening was necessary to recapitulate the patterns of constriction and expansion observed in vivo. Furthermore, modelling predicts that cells can stretch until the actin networks yield and break. Laser ablation and optogenetic reduction of contractility in central cells results in a reduction in the apical surface area of lateral cells. An optogenetic increase in contractility in lateral cells caused an increase in apical area in central cells. Together, these data suggest that mechanical cues can override and contribute to sculpting genetically defined morphogenetic domains.

      I propose to address the following points before further considering the manuscript:

      Major

      1. Figure 3: following laser ablation of central cells, lateral cells reduce their apical surface. How do the authors know that this reduction in lateral cell apical surface area is an active process, driven by actomyosin-based contraction, rather than a passive response to the expansion of the wound induced by laser ablation?

        A similar argument could explain the constriction of lateral cells after optogenetic inhibition of actomyosin networks: the central cells relax, expand and compress the lateral cells.

      With regard to the comparison to wounds, it is important to note that the epithelium is not actually wounded by either ablation method. Thus, while the treatments ablate the actomyosin meshwork, they do not ablate or kill the cells. Perhaps the term is an unfortunate choice, since it is more commonly used in developmental biology for killing cells. However, here the cells remain intact and when the optogenetic or laser treatment is released the cells resume their physiological activities.

      We have added a note in the text and now refer to ‘laser microdissection’, a term of art in the field, for more clarity.

      Regarding the more important question of what is the active process, expansion of the central cells or constriction of the lateral cells, a contribution from expanding central cells is of course in theory not impossible.

      However, for this scenario to work, in the absence of pulling from the lateral cells, there would have to be a force that is generated in the central cells, in this case a pushing force that would expand the cells and act on the lateral cells. We have shown in our previous work that if the actomyosin is dissected in dorsal cells, which are not surrounded by potentially contractile cells, the cells do not expand (Rauzi et al, 2017). This shows that ‘relaxing’ by itself does not have ‘expansion’ as a consequence. One would therefore have to consider how such a pushing force could arise in these cells. We can think of only two possibilities: hydrostatic pressure or an active force from the subcellular molecular machinery.

      Considering hydrostatic pressure, if the apical actomyosin that is ablated was responsible for maintaining such a pressure inside the cell (a reasonable assumption), then releasing the actomyosin would allow the cell volume to push against the neighbouring cell. However, such a recoil would occur on a very short time scale (seconds), whereas we see the contraction of the lateral cells continuing over extended periods (minutes).

      Alternatively, expansive forces could be generated by the cytoskeleton. Cytoskeletal pushing forces can come from microtubules (classical example: mitotic spindle; epithelial morphogenesis: work from T. Harris and B. Baum labs: PMID 18508861 and 20647372), or from continuous creation of new cross-linked or branching actin networks pushing against plasma membranes, as in the leading edge of crawling cells. But the microtubules in the blastoderm cells are not oriented in such a way that they could provide a force in the correct dimension in these cells (the majority are oriented along the apical-basal axis). In addition, the connection between MT and the plasma membrane depends on the cortical actin meshwork (involving, for example, the actin-binding proteins P120-Catenin or patronin/Shot; Roeper lab, PMID 24914560, StJohnston Lab, PMID: 27404359), but the connection of actin with the plasma membrane has been severed in the optogenetically manipulated cells.

      By contrast, we show that normal lateral mesodermal cells possess a contractile actin network. So the only sustained force generated in the system at this point is the contractile force in lateral cells (which is normally counteracted by the stronger contractile force from central cells).

      Thus, we conclude that the expansion of central cells is a passive response to a contractile force from lateral cells, not an active process, and, conversely, that the constriction of lateral cells is an active autonomous process.

      To demonstrate active responses of the lateral cells upon laser ablation and optogenetic manipulations of central cells, at the very least the authors should show the distribution of myosin in the lateral cells that constrict and demonstrate the assembly of contractile networks.

      We have now included the requested data for the experiments with laser ablations. Suppl. Fig. 8 and Suppl. video 3 show the myosin that accumulates in lateral cells. It would be nice also to be able to show this for the optogenetic experiments. However, despite trying hard, we have not succeeded in generating healthy embryos that carry the entire set of transgenes that are necessary to carry out the optogenetic experiments and at the same time visualize myosin (see also response to referee 2, point 3).

      2. Modelling suggests that actin networks yield and break in lateral cells. Does this occur in vivo?

      We postulate that the skewed and inhomogeneous distribution of myosin and the large myosin-free areas in stretched cells (lines 170 – 172 in the original text) are indications of a yielding meshwork, or at least of uneven force distribution in the network that leads to ineffective contraction or even release – i.e. functionally correspond to yielding. We have made this more explicit now.

      We have also added an additional panel quantifying more clearly the proportion of low-myosin areas in lateral cells (now Fig. 3H).

      Work from the Lecuit lab has recently shown beautifully that it is the connectivity of the myosin mesh rather than the underlying actin meshwork that affects apical forces in epithelial cells (PMID: 32483386), and our own findings are entirely consistent with that.

      3. Lines 166-175: The authors propose that constriction of a cell affects the localization of myosin in its neighbors. However, this is not directly measured. The authors should quantify the relative myosin offset in the cells around constricting cells, and show that that offset is greater (and oriented towards the constricting cell) than in cells around expanding cells. There should be a correlation between the relative size change of a cell and the myosin offset (not just concentration) in their neighbours.

      We now provide measurements of the rate of cell area change against the offset of surrounding myosin (the distance of myosin from a cellular border). We see that surrounding myosin is closer to the border of constricting cells and tends to be further away from the borders of expanding cells.

      We have added these data to the new Fig. 3I.

      In addition, does optogenetic activation of constriction in lateral cells affect the offset of myosin networks in central cells?

      This is technically challenging. For such an experiment we would need an embryo to express membrane and myosin markers in addition to the two optogenetic constructs and the GAL4 driver. We tried multiple times to generate such a cross, but obtained either no embryos or, at best, deformed embryos. We also tried to use the MCP-MS2 system in parallel to CRY2-RhoGEF2 but the crosses had the same problem. This sensitivity to additional genetic load was also observed in the DeRenzis lab, who generated these strains and tested and used them extensively.

      4. Fig. 2E-F: the authors argue that the mean myosin concentration in lateral cells at certain times is equivalent to that of central cells earlier in the invagination process. However, the fraction of apical surface area covered by myosin network is consistently lower for lateral cells (and also for central cells that remain unconstricted!). Have the authors considered this fact, and if not, why? Wouldn't this explain, at least in part, why some cells constrict and others do not, if medial myosin networks drive the disassembly of the apical surface?

      We believe in fact that this is precisely part of the picture and it was what we had meant to propose, but the text was perhaps indeed just too condensed. Thus, we had stated in line of the original document:

      “While the asymmetry is visible in all cell rows, there are larger areas without myosin and the distance of displacement is greater in lateral cells (Fig. 2G-J)”,

      and in the discussion (line 277 – 285):

      “Despite the homogeneous actin meshwork in stretching cells, the areas that are free of active myosin occupy a large proportion of the apical surface – similar to ectodermal or amnioserosa cells in which the connection of pulsatile foci to the underlying actin meshwork is lost. ... Dilution of cortical myosin may compromise a cell’s ability to make sufficient physical connections, in particular along the dorso-ventral axis, so that even if sufficient force is generated, it cannot shorten the cell in the long dimension. In other words, even though the cells have enough myosin to create force, the system is not properly engaged and its force is not transmitted to the cell boundary.”

      However, we didn’t state this with sufficient clarity in the results section and have added an extra sentence to this effect.

      If myosin activity were increased in lateral cells once central cells begin constricting, would that lead to an increased fraction of lateral cell surfaces covered by actomyosin networks and to reduced lateral cell elongation?

      This is a really nice experiment, and we have indeed tried to induce activation at later time points, but unfortunately this did not yield unambiguous results. If we did the manipulation after the central cells had clearly constricted, then activating lateral cells did not lead to their contraction. However, this is a negative result, and we have no independent criterion for knowing how 'strong' the induced contraction was (as explained above, we are unfortunately not able to visualize the myosin in these experiments) or why it might not have been sufficient to overcome the pull from central cells.

      In this context it is worth remembering that in mutants in which myosin is overactivated as a result of defective upstream signalling, lateral cells stretch less or not at all. See PMID: 24026125 for gprk2 mutants and our own results for active Rho1:


      Figure: Confocal Z-section of embryos expressing sqh::GFP (myosin; green) and GAP43::mCherry (membrane; magenta) imaged ventrally. A constitutively active form of Rho1 is ectopically expressed using a maternal Gal4 driver, inducing activation of myosin in more lateral cells. White dots mark the mesectoderm determined by backtracing after ventral furrow invagination. Yellow arrows in B are constricted cells in row 7/8.

      Minor

      1. Image panels are missing scale bars in many figures.

      2. Fig. 1C'-D': The authors should include a color bar to provide some indication of the scale of the apical areas measured. Same comment for other figures in which apical area is color-coded.

      We have added the missing elements.

      1. Supp. Fig. 2E-F, G-H and Supp. Fig. 6: what is the difference between myosin intensity and myosin concentration? Junctional vs medial localization? Or summed vs mean pixel value? Please be specific, the difference between intensity and concentration is not clear.

      In the cases where we talk about myosin ‘amount’ we have now exchanged the term ‘intensity’, i.e. the physical term for the amount of light, for ‘amount’ (i.e. that for which we use the light intensity as a proxy) and have explained in the main text how we define total apical myosin amount and apical myosin concentration (amount over area). However, in the cases where we are describing the actual image analysis, as in Suppl. Fig. 3, we use ‘intensity’ as the term of art that is used for the methods employed here. Similarly, the terms ‘sum intensity’ and ‘mean intensity’ are terms used for image analysis in Fiji.

      The definitions of “junctional” and “medial” actin were introduced by the Lecuit lab (PMID: 21068726), and we have included the appropriate reference.
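      The distinction drawn in this response between myosin amount and myosin concentration can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the pixel values and the pixel area are hypothetical and are not taken from the manuscript or from Fiji.

```python
def apical_myosin(pixel_values, pixel_area_um2):
    """Return (amount, concentration) for one cell's apical region.

    pixel_values   -- intensities of the pixels inside the cell outline
                      (hypothetical numbers, used as a proxy for myosin)
    pixel_area_um2 -- area covered by one pixel, in square micrometres
    """
    if not pixel_values:
        raise ValueError("the cell region contains no pixels")
    amount = sum(pixel_values)                    # 'sum intensity'
    area = len(pixel_values) * pixel_area_um2     # apical area of the cell
    concentration = amount / area                 # amount over area
    return amount, concentration


# Two cells with the same total amount but different apical areas.
small_cell = apical_myosin([10, 20, 30, 40], pixel_area_um2=0.25)
large_cell = apical_myosin([10, 20, 30, 40, 0, 0, 0, 0], pixel_area_um2=0.25)
```

      In this toy case both cells contain the same amount of myosin, but the larger cell has half the concentration, which is why the two quantities need separate names.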

      1. Line 118: Supp. Fig. 2 does not have panels I and K.

      5. Line 223: the authors reference data at sec, but Supp. Fig. 6 does not show any images at that time point. They should be added or a different time point indicated.

      These errors have been corrected.

      Typos

      1. Abstract: "[in a supracellular context" should be "in a supracellular context".

      2. Line 145: should this be a reference to Supp. Fig. 5 instead of Supp. Fig. 4?

      3. Line 166: I am not sure how Supp. Fig. 5 supports this statement. Is this the right figure reference? Should it be Supp. Fig. 4 instead?

      4. Line 881: "representing on line" should be "representing one line".

      These errors have been corrected.

      Optional

      Tony Harris' lab showed that the Arf-GEF Steppke antagonizes myosin and facilitates cell deformation at the leading edge of the embryonic epidermis during Drosophila dorsal closure (West et al., Curr Biol, 2017). Does Steppke localize to junctions in lateral but not central mesoderm cells? Does the pattern of Steppke localization in the mesoderm change with manipulations to the contractility of central cells?

      This is certainly interesting, and we have ordered the protein trap, UAS constructs and RNAi lines. However, these will be long-term and time-consuming experiments.

      Significance:

      This is an interesting study, and one that makes use of beautiful tools, including quantitative microscopy and image analysis, mathematical modeling and optogenetic manipulations. The prediction that embryonic cells display non-linear stress-strain responses is exciting, as linearity has been the predominant assumption so far. However, I find that model predictions are not well supported by the data, and that alternative interpretations of some results are possible. Additionally, the paper lacks insight into the molecular mechanisms that facilitate stretching (although that could be the subject of a follow-up study).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, the authors explore potential mechanisms for why some cells constrict while other cells expand, despite similar intrinsic genetic programs, during Drosophila ventral furrow formation at the onset of gastrulation. The authors combine quantitative analyses of cell shapes and myosin levels from multiphoton confocal and Multi-View SPIM imaging, optogenetic and laser perturbation experiments, and mechanical models to argue that nonlinear mechanical interactions between cells are required to explain the cell behaviors. Based on microscopic models of the actomyosin cytoskeleton in the tissue the authors argue that the required nonlinear mechanical behavior is consistent with actomyosin network reorganization.

      Major comments:

      • Although the area of investigation is exciting and the results are interesting, unfortunately the quality of the results and comparison between experiment and modeling in the current version of the manuscript are not convincing. Although it is not clearly explained in the manuscript, the experimental results on cell shapes, myosin intensity, laser manipulation, optogenetic perturbations appear to be from a single embryo or small number of embryos for each experiment (Figures 1, 3, 4).

      We had analysed a much larger number of embryos, but only included those for presentation that provided the most extensive data. It is extremely difficult to obtain absolutely ‘perfect’ embryos at high resolution for full quantification over long periods. ‘Perfect’ means that the embryos are mounted in such a way that they are imaged from an angle of 45 degrees off the dorso-ventral axis, so that initially mesodermal rows 3 to 7 are seen, and then, as furrow formation progresses, the more lateral rows move through the field of vision. It is difficult to mount in this perfect manner for two reasons: the shape of the embryo means that the embryo does not ‘like’ to be balanced in this position, but instead prefers to fall back on its side. Secondly, the embryo has to be mounted at a time point before visible differentiation along the D-V axis, so no visual cues exist to get the positioning right. This means that many of our recordings lack either the more ventral or the lateral cell rows. While the findings for these more restricted observations are fully consistent with our reports, they cannot be quantified with a full comparison across all cell rows over the entire imaging period. Nevertheless, we have processed and analysed further examples which we have now included in Suppl. Fig. 2 and Suppl. Fig. 8.

      The authors state that the cell stretching pattern "was best recapitulated by a superelastic response", but did not provide direct quantitative comparisons of the different mechanical models to the experimental data to clearly demonstrate this.

      Data that illustrate this were shown in Suppl. Fig 5 – but, admittedly, were not well explained, or rather, not at all. We have now added better explanations, expanded the figure, included new analyses, and now present some of these data in the new Fig. 2. Briefly, the figure shows that superelastic and elastoplastic responses are the only curves that successfully reproduce the pattern of stretching lateral cells (last 3 cells stretching with the inner cell stretching most and the last cell stretching least) while at the same time matching the ratio between the cell sizes of the most stretching cells to the least stretching cell.

      The top row of the parameter scans in Suppl. Fig. 5 (now Fig. 2) shows how many cells stretch for each combination of myosin curve steepness (y-axis) and width (x-axis) with shades of blue indicating the number of cells, and the red outline in the field where 3 cells stretch outlining those conditions where the inner cell stretches most. The bottom row shows the resulting size ratios of largest to smallest cell. High ratios in the region outlined in red in the top row are only reached for the superelastic and elastoplastic responses, with the elastomeric tending in the right direction.

      We have now also quantified a goodness-of-fit (root mean squared error, RMSE) measurement between our experimental data and the simulated data of all our models. This is shown now in the new Fig. 2.
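      As an illustration of the goodness-of-fit measure named above, a minimal RMSE sketch follows. The cell-size values are invented placeholders, not data from the manuscript, and the model labels are used only to name the two hypothetical simulated outputs.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical relative cell widths for one half of the mesoderm.
measured = [0.4, 0.5, 0.6, 1.8, 1.4, 1.1]
simulated_by_model = {
    "linear":       [0.6, 0.7, 0.8, 1.2, 1.2, 1.2],
    "superelastic": [0.4, 0.5, 0.7, 1.7, 1.4, 1.0],
}

# Lower RMSE means the simulated pattern is closer to the measured one.
scores = {name: rmse(measured, values)
          for name, values in simulated_by_model.items()}
best_model = min(scores, key=scores.get)
```

      With these placeholder numbers the model whose output tracks the measured pattern more closely receives the lower score; the same comparison can be repeated for every stress-strain curve.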

      We also note that only the parameter maps of the superelastic and elastoplastic models (Fig. 2J,K) resemble the equivalent parameter maps of the microscopic model (Fig. 3Q).

      Moreover, the local optogenetic myosin recruitment experiments in Figure 4 do not provide sufficient information on optogenetic tool recruitment,

      We have included images that illustrate the optogenetic construct in the illuminated cells, but not in the central cells in Suppl. Fig. 8. It is impossible to show the construct in the ‘dark’ cells, because illuminating them would activate the construct.

      myosin localization,

      As explained above, this is unfortunately technically not feasible. The best we can do is refer to the description of the construct by Izquierdo et al. (PMID: 29915285), which shows the accuracy of the tool and the highly specific membrane recruitment of myosin.

      or cell behaviors

      We have added quantitative comparisons between the experimental and control areas.

      to justify the claim that the central cells are not activated by the optogenetic perturbation and are only responding to the forces from neighboring cells.

      • The authors should provide direct quantitative comparisons of the models and experiments to clearly demonstrate their claims that the superelastic model is better than the linear model or other nonlinear models.

      See response above.

      • The authors should do additional experiments and/or provide more details for the existing experiments (to include several embryos per condition) on myosin quantification, photo-manipulation, and optogenetics experiments.

      We have provided data for more embryos for all cases.

      Additional controls would likely be necessary for claims resulting from the optogenetics experiments in Figure 4.

      This has been addressed above – we have provided additional data and controls.

      • The additional time and resources required to address these concerns would depend on the experimental details, N values, and statistics in the current studies, which unfortunately were not described in the current manuscript.

      We have been able to add substantial additional data and have added the requested numbers. For many of the experiments each recording can be very time consuming and for the reasons explained in this response, it is not always easy to obtain precisely the desired recording from the desired imaging angle with the manipulations having been done precisely in the desired position. The numbers of embryos are therefore not high, but multiple shorter recordings provide a body of results that support the findings, but are not easily comparable statistically.

      • Methods descriptions for reproducibility are generally adequate, with the exception of N values and statistics

      See above.

      • Are the experiments adequately replicated and statistical analysis adequate?

      No, see above.

      Minor comments:

      1) Scale bars for images are missing throughout.

      We have added these.

      2) Number of embryos and cells analyzed missing throughout text and figure legends.

      We have added additional embryos for all conditions and have included the numbers of cells analysed for all quantifications (except in cases where each data point represents a cell).

      3) Units are missing for many quantities in figures and tables throughout.

      We have added these

      4) Many figure references in the main text are incorrect, pointing either to the wrong figure or wrong figure panel.

      These have been corrected

      5) Line 728. What time point was used for myosin concentrations used in the model?

      We have added this information to the figure legend.

      How might myosin dynamics influence these findings?

      As regards the subcellular dynamics of myosin, these are included in the microscopic model (see Belmonte et al.; PMID: 28954810). Preliminary results showed that small changes in myosin stall force and unloaded myosin speed have little effect on our general results. This is now shown in a new supplemental figure (Suppl. Fig. 6). However, if the referee is referring to the dynamics of myosin accumulation over time, this is an interesting question.

      We had begun to explore this topic, but then realized for the linear stress-strain model that it is in fact expected that the time course of myosin accumulation would ultimately not affect the outcome. This is because in a linear model the final state of the system is determined by the final shape of the governing myosin profile regardless of the time evolution of the profile, and our simulations confirm this. A systematic analysis for all other stress-strain curves with temporal changes in myosin profiles (where a dependency on the temporal evolution of the profile is expected) is very time-consuming and will be interesting to pursue in future.

      The main conclusion here that linear models do not recapitulate the observed data as well as the non-linear ones stands regardless of how the temporal dynamics of myosin accumulation may affect the non-linear systems.
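      The path-independence argument for the linear case can be illustrated with a toy simulation. The sketch below is not the model of the manuscript: it assumes a one-dimensional chain of cells with fixed ends, identical linear springs, a damping constant and an active myosin term, and every parameter value is a hypothetical choice in arbitrary units.

```python
def relax_linear_chain(final_myosin, ramp_time, k=1.0, damping=1.0,
                       rest=1.0, dt=0.05, t_end=200.0):
    """Return final cell lengths of an overdamped chain of linear springs.

    Myosin in each cell ramps linearly from 0 to its final value over
    ramp_time and then stays constant. Both ends of the chain are fixed.
    """
    n = len(final_myosin)
    x = [i * rest for i in range(n + 1)]          # membrane positions
    for step in range(int(t_end / dt)):
        ramp = min(1.0, step * dt / ramp_time)
        # Tension per cell: linear elastic term plus active contraction.
        tension = [k * (x[i + 1] - x[i] - rest) + ramp * final_myosin[i]
                   for i in range(n)]
        # Each interior membrane moves toward the neighbour pulling harder.
        for j in range(1, n):
            x[j] += dt * (tension[j] - tension[j - 1]) / damping
    return [x[i + 1] - x[i] for i in range(n)]


profile = [0.1, 0.2, 0.6, 0.6, 0.2, 0.1]          # hypothetical final myosin
fast = relax_linear_chain(profile, ramp_time=1.0)   # myosin appears quickly
slow = relax_linear_chain(profile, ramp_time=50.0)  # myosin appears slowly
```

      Under these assumptions both runs settle to the same cell lengths, because at steady state the tension is equal in all cells and depends only on the final myosin profile. A non-linear stress-strain term would break this argument, which is why a separate analysis would be needed there.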

      6) The authors show a few examples of myosin pulsing in lateral cells and then conclude that myosin pulsing is not qualitatively different from central cells (lines 135-136). The authors should quantify the number of pulsing lateral cells as well as period and amplitude of pulsing, or discuss relevant results from prior studies in more detail to justify this conclusion.

      By ‘not qualitatively different’ we had meant only ‘in the sense that they are capable of generating contractile forces’, and we have made that more explicit in the text now. The quantitative differences have already been analysed and reported by the Martin lab (https://doi.org/10.1101/2020.04.15.043893; the pulses are slower and less persistent), and our point was that in spite of these known differences, the pulses are able to mediate constriction.

      7) Lines 145-150. The authors very briefly describe the results of the linear stress-strain response and conclude this did not yield outputs corresponding to in vivo data and leave this largely to the supplementary figures. This is a key point in the paper and deserves much more discussion and space in the main text.

      We have included a more extensive description and interpretation of the results in the main text, as detailed in several responses above

      As mentioned in main comments above, a quantitative comparison of the different mechanical models to show that the superelastic model better describes the observations should be included (potentially as an inset to Fig 2D showing a quantitative measure of the quality of model fit to the data).

      These comparisons have now been expanded and explained more extensively and moved to the main Figures.

      8) Lines 162-163. Provide more rationale for why strain-softening would most likely manifest as permanent or reversible cytoskeletal reorganization.

      The only component of the cell that can likely mediate this physical property and also respond at the observed time scales is the cytoskeleton. In these cells it is the main mechanical determinant. Other components that could in principle contribute to the nonlinearity of the stress-strain response might be the viscosity of the cytosol, or the plasma membrane. However, fluids under shear usually respond with increasing stiffness and rarely, if ever, with shear thinning. The same is mostly true for colloidal solutions. Therefore it is more likely that the stress-strain relationships at the apical surface of the cells are dominated by the dynamics of the actin cytoskeleton, given that even the shape of the plasma membrane is in general determined by the cytoskeleton. We have added a note to this effect in the text.

      9) Lines 187-188. "This shows that forces acting on each cell from its neighbors have an important role in determining the cell's behavior." This seems somewhat obvious; perhaps a bit more explanation would help the reader to understand the importance of these results.

      We have expanded the explanations of these findings and added a sentence to relate them to the main model of the paper

      10) Lines 196-198. How were the concentrations and lengths of F-actin chosen? How were the concentration and properties of linkers chosen?

      The parameters were chosen on the basis of our earlier studies on simulated contractile meshworks and the theory underlying their behaviour. We had reported the conditions under which such meshes are able to contract, and also shown that the underlying theory correctly predicts behaviour of experimental meshworks (for those few conditions for which they have been reported).

      Unfortunately, there are practically no measurements for the length of F-actin filaments in vivo and estimates vary widely. Reliable data on the density of the cortical network are equally sparse.

      Based on our own previous work we chose concentrations of cross-linkers, myosin motors and transmembrane connectors that are able to ensure optimal contraction and force. Our in vivo measurements reported here show that the amounts of F-actin do not vary significantly across the mesoderm, so we used the same concentration of actin, crosslinkers and membrane connectors in all cells of the model, varying only myosin concentration. Taking into account the cell diameter of the mesodermal cells (~7um) and to ensure that the meshwork is sufficiently cross-connected (dense) to generate contraction and transmit forces between cells we used a model where each cell contains F-actin filaments of 1.5 um.

      We have expanded our supplemental material to make these points clearer.

      How sensitive are the results to these details of the cytoskeletal composition?

      We varied both the amounts of cytoskeletal components and the parameters controlling their dynamics (such as myosin stall force and viscosity) and found little impact on model predictions. These data are now presented in Suppl Fig. 6.

      11) Lines 238-244. It would be helpful to include some additional quantification that clearly shows the reader the differences in cell behaviors in control and perturbed tissue.

      We have added quantitative comparisons of the cells in the perturbed region with cells in an equivalent control region, together with evaluations of two additional embryos.

      For the optogenetics experiment, it would be important to show quantification that the lateral cells are not being directly perturbed during photoactivation of neighboring cells (e.g. due to light leakage).

      We have included this information, as described above.

      In both perturbations, it would be helpful to quantify how many cells in rows 7 and 8 constricted and by how much did they constrict? How reproducible were these effects?

      The perturbation experiments were those where it was most difficult to obtain a large number of identical-looking embryos that would allow broad statistics to be applied. For this to work, we would have to have embryos that were identically mounted and illuminated in the identical area of precisely rows 1 to 6 on each side of the midline – at a resolution of one cell row of 6.2 um width. And all this blind, because at the start of the manipulation there are no visual cues for orientation. Morphology gives no cues at this stage. The MS2-MCP-GFP works for laser ablations, but cannot be used for the optogenetics, because the embryo must not be exposed to blue light. This means we cannot predetermine precisely which rows we target.

      We have however added data and quantifications for the control and two further laser-manipulated embryos, which are now shown in Suppl. Fig. 8. It is evident from both that our perturbations were slightly asymmetric and included the outer rows on only one side, and on that side several cells that would normally have stretched are now strongly constricted. While by no means true for all lateral cells, this is a case of one black swan disproving the hypothesis that all swans are white: any constricting cell within two cell diameters of the mesectoderm, i.e. one that would normally stretch, proves that lateral cells do have the capacity to constrict.

      12) Lines 245-252. A key assumption in interpreting this experiment seems to be that the central cells are not directly perturbed by the optogenetic activation. Additional quantifications of RhoGEF2-CRY2 and/or myosin should be shown to support this.

      We have included an image of the optogenetically activated construct in this experiment in Fig. 5, but we cannot show its behaviour in the non-activated part because if we illuminated it, it would be activated. We were unable to create the embryos necessary to document the behaviour of myosin.

      It would be helpful to include some additional quantification that clearly shows the reader the differences in cell behaviors in control and experimental regions. How reproducible were these effects?

      We now provide the results from two additional embryos in Suppl. Fig. 8, and include quantitative comparisons between the control and experimental regions for these and for the embryos that are currently shown in Fig. 5 E.

      13) A section on statistics is missing from the methods section.

      We have added descriptions of the quantifications and statistics.

      14) Line 615. Ensure that Eq. 1 is dimensionally consistent; crucially, what units are used for 'M'? If the model is non-dimensionalized, provide the reference scales.

      Apart from the initial distance between membrane positions (set to 6.2 um) all other units in our visco-elastic model are arbitrary. In order to make this clearer, instead of using the term “viscosity” in equation 1, we now call it a “damping constant”.

      15) Line 675: The investigated stress-strain relationships are presented in Table S1. What are the definitions of xpl and xsh?

      We have included these definitions in materials and methods:

      All stress-strain curves are linear for extensive strains (Δx) lower than the proportionality limit (xpl), with some curves (elastoplastic and superelastic) undergoing a strain-softening to strain-hardening change after a given strain-hardening limit (xsh).
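      One way to picture the definition above is a piecewise-linear curve with three regimes. This is a sketch only: the functional form, the softening factor and all numerical values are assumptions made for illustration and are not the entries of Table S1 or Table S2.

```python
def superelastic_stress(strain, k=1.0, x_pl=0.5, x_sh=1.5, soft=0.1):
    """Stress for a non-negative extensive strain (arbitrary units).

    strain < x_pl          : linear response with stiffness k
    x_pl <= strain < x_sh  : strain-softening, stiffness reduced to soft * k
    strain >= x_sh         : strain-hardening, stiffness back to k
    All default values are hypothetical.
    """
    if strain < 0:
        raise ValueError("this sketch covers extensive strains only")
    if strain < x_pl:
        return k * strain
    plateau_start = k * x_pl
    if strain < x_sh:
        return plateau_start + soft * k * (strain - x_pl)
    plateau_end = plateau_start + soft * k * (x_sh - x_pl)
    return plateau_end + k * (strain - x_sh)


def linear_stress(strain, k=1.0):
    """Reference linear response with the same initial stiffness."""
    return k * strain


# Sample both curves at a few strains to compare them.
strains = [0.25, 1.0, 2.0]
curve = [(s, linear_stress(s), superelastic_stress(s)) for s in strains]
```

      Below xpl the two curves coincide; beyond it the softened curve needs much less additional stress for the same additional strain, until the hardening limit xsh restores the original stiffness.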

      16) Line 678: Parameter values for the stress-strain relationships are given in Table S2. Can you provide more information on how these values were selected and their units? How sensitive are the results to changes in these values? Provide references when possible.

      The values for xpl and xsh were chosen to be within the range of the observed lengths of stretching cells, with xpl < xsh. Changing the values of each parameter listed in Table S2 does change the results quantitatively, but over the ranges we tested them, never to the point of making the linear or the other non-linear models reproduce the target pattern of stretching.

      We have stated this in the materials and methods section.

      17) Line 697. Please comment on why the embryo appears skewed to the right.

      Embryos are not always ‘perfect’, unfortunately. In addition, they can get slightly squashed during mounting and imaging. In spite of its imperfection, we showed this particular one, because we had imaging data for a long period without drift or other interference, and with good contrast at great depth.

      18) Line 712. A color-bar corresponding to this color-code is missing in the figure.

      This has been corrected.

      19) Lines 715-717. It seems panels E and E' are swapped in the legend.

      corrected

      20) Line 724 (Fig 2). It is difficult to read anything in panel K inset or Panel L inset.

      We have rearranged this figure and replaced some panels for greater clarity, and to remove redundancy.

      21) Line 728. What does "embryo 1" refer to?

      This was a remainder from an old plan where each embryo was numbered and listed in a table so that it could be cross-referred to. We have now described in the supplementary table the genotypes and imaging technique for each group of embryos. Where we show data or analyses of the same embryo in different figures, we refer directly to the relevant panels. We have made sure the embryos are referred to correctly in the figure legends.

      22) Line 732. A quantitative measure of the quality of the fits of the models to the experimental data should be included.

      We have done this, and the new data are now included in the new Figure 2.

      23) Line 739. What exactly does "Embryo 2" refer to?

      See comment 21

      24) Line 779. Why is a z-plane of 15 microns below surface chosen?

      25) Line 797. Why is a z-plane of 25 microns below the surface chosen?

      The planes were chosen in each case to show the reader in one single plane rows 7 and 8 along with the central cells.

      26) Line 900. Panel G in Supp Fig 5 is not described in figure description.

      The panel captions were wrongly numbered. This has now been corrected, and more information on this figure has been included in the text.

      • Are prior studies referenced appropriately?

      Yes.

      • Are the text and figures clear and accurate?

      No (see details listed above).

      • It would be very helpful to the reader to show direct quantitative comparison of the different mechanical models with the experimental observations to show how much better the nonlinear model is compared to the linear model.

      We have included this.

      An extended explanation of experiments and experimental results within the main text would improve the manuscript.

      We have expanded our explanations in many places.

      Significance:

      The key advance in this work is in identifying a potential role of nonlinear mechanical properties in contributing to distinct cell behaviors within a tissue during development in vivo. This contributes to a growing body of work highlighting the importance of cell and tissue mechanical properties in regulating cell behaviors during the formation of tissue structure.

      This work adds to a growing body of work connecting actomyosin contractility in cells to tissue-scale behavior during development. This work provides a unique mechanical modeling perspective to the study of apical constriction during Drosophila ventral furrow invagination, highlighting a potential role for superelastic cell mechanical behaviors during morphogenesis in vivo.

      The finding would be of interest to researchers working in the areas of morphogenesis, mechanobiology, the cytoskeleton, and active matter.

      This reviewer's expertise is in experimental studies of the cytoskeleton and cell mechanics during morphogenesis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer 1:

      I think the experiments within the manuscript are generally of good quality and well controlled.

      We would like to thank the reviewer for the appreciation of our work.

      ...However, I find that the authors' conclusions are very often not supported by the experiments performed (as detailed below) and I would strongly recommend that the authors stick to the conclusions that can be drawn based on the data they have generated. In my opinion, this manuscript contains findings that are of interest to the field but it needs to be rewritten with more justifiable conclusions.

      We have extensively rewritten the manuscript and toned down the role of the HMR/LHR complex in hybrids while emphasizing its role in Drosophila melanogaster.

      1) 'Speciation Core Complex' - The only link to speciation is the fact that the 'SCC' includes D.melanogaster HMR, a known hybrid incompatibility gene. On the other hand, all of these proteins have important functions in a pure species context and all of the interactions reported between the members of the SCC occur in a D.melanogaster background. Also, SCC assembly in viable/inviable hybrids is not tested. Essentially, I would come up with a different and more functionally consistent name for the complex. I highly recommend against naming these stable interactors as the 'SCC' unless the authors can show that mutating any of the other 'SCC' proteins (specifically NLP, NPH, BOH1 & BOH2), which should presumably also disrupt SCC formation, leads to the rescue of hybrid male lethality?

      We agree with the reviewer that we base the naming of the complex on the presence of the products of the two known hybrid incompatibility genes Hmr and Lhr. As we did not investigate the complex's composition in hybrids we agree with the three reviewers that the term SCC is probably misleading. We also agree with the reviewer that it would be highly interesting to investigate whether NLP, NPH, BOH1 or BOH2 mutations also rescue hybrid male lethality. However, we would need to generate fly lines carrying mutations in both the D.mel and the D.sim alleles since the respective genes are autosomal and we feel that this would be beyond the scope of the manuscript. Moreover, such assays would only be possible if those genes are non-essential and not like Nlp, of which the available hypomorphic or deletion alleles are homozygous lethal (Padeken, J. et al., 2013).

      2) Is it a stable 6-membered complex? - The only line of evidence for the presence of a stable complex between all 6 proteins are the MS data from Figure 1C and Figure S1A-C. Although I don't think it is necessarily required, a biochemical demonstration that these proteins co-sediment at a high MW would be a much stronger indication of complex formation. That being said, I think the authors can use their expertise in AP-LC/MS to more comprehensively characterize complex formation.

      Besides the fact that we observe all six components in AP-MS experiments using any one of the subunits, we have also shown in our previous experiments (Thomae et al, 2013) that all subunits can be purified by a tandem purification using first an antibody against FLAG-HMR followed by a Myc-LHR antibody. We also tried to purify the HMR complex via size exclusion chromatography to determine the size of the complex as suggested by the reviewer. Unfortunately, we did not manage to isolate enough of the complex in a soluble form to detect a single peak on a size exclusion column. This may be due either to a disassembly of the complex during the unavoidable dilution during SEC or to a lack of antibody sensitivity. We also tried to reconstitute the entire complex from recombinantly expressed proteins but failed to express all subunits in a soluble form. It is worth mentioning that a similar observation has been made, for example, for the Dosage Compensation Complex, which, despite being well characterized, has also eluded characterization by size exclusion chromatography.

      a) For example, the authors could test whether loss of BOH1/BOH2 in S2 cells impacts complex formation. A reduction of interactions between other complex members would strengthen the authors' conclusion of a stable and stoichiometric 6-membered complex.

      Based on our observation that HMR and LHR form a stable heterodimeric complex in vitro (Figure S4) we assume that the presence or absence of the other components does not affect the complex composition in its entirety. The experiment suggested by the reviewer would allow us to distinguish between direct and indirect interactions between BOH1/2 and HMR. Though this is clearly a very exciting approach, RNAi-mediated knockdowns are rarely complete in S2 cells, making such experiments difficult to interpret. Therefore, these experiments would need to be supported by reconstitution of the different complexes in vitro and potentially crosslinking MS experiments. Such extensive molecular analysis would very likely require at least 6 months to complete and would be beyond the scope of the current manuscript.

      b) Additionally, I would suggest that they use one (or more) of BOH1/BOH2/NLP/LHR as baits in the S2 cells expressing HMR mutations (HMR2 and HMR DC, Figure 3) to test complex formation. Beyond Figs. 1 and S1, the authors only test one-way interactions between HMR (or HMR mutants) and the other 5 binding partners. It is unclear if the other 5 'SCC' members are capable of binding each other when HMR is mutated. As a result, how HMR affects the ability of other proteins to interact with each other and its role in complex formation remains somewhat unclear. This is particularly important since the authors conclude in the discussion that "HMR acts as a molecular bridge between different modules of the SCC" and that "the integrity of the SCC is essential for its function".

      Similar to our answer to the reviewer’s suggestion above, we believe that this experiment requires an additional extensive molecular analysis to be meaningful, which is beyond the scope of the current manuscript. It is important to clarify here that the S2 cells we use still express endogenous full-length HMR, which could participate in complex formation even when Hmr mutant alleles are expressed. To unambiguously show that BOH1 and BOH2 still interact with the other complex components when they no longer associate with HMR, we would therefore need to generate a CRISPR-based exchange of all HMR genes in SL2 cells with a mutated version of HMR and analyze their interaction partners. As both alleles fail to fully rescue HMR functionality in a deletion background and as we have shown previously that removal of HMR results in mitotic defects, it may not even be possible to generate such cell lines.

      3) Centromeric vs heterochromatic localization of HMR - There appear to be some differences between Hmr localization across different tissues as the authors have noted in their introduction. In this manuscript, the authors assess HMR localization in S2 cells as well as mitotic and endocycling follicle cells from various stages of oogenesis. In these cell types, the authors compare HMR localization to both Cenp-C (centromere) and HP1 (constitutive heterochromatin). In my opinion, it is not easy to get a clear perspective on what the authors consider to be HMR's true localization in these cells and tissues. I would recommend the following straightforward changes/experiments related to this point:

      a) Label the image categories in Figure 4A. Please also describe in detail the classification criteria that were used to separate these image categories from one another.

      In the revised manuscript we will label the image categories in Figure 4A. An extensive description of how the classification criteria were applied can be found in the methods section.

      b) I would also move Figure S7A to the main text since it demonstrates centromeric colocalization of HMR in early follicle cells.

      In the revised manuscript we will move figure S7A to a new figure 5C. We have furthermore investigated the localization of endogenous HMR in various cell types in ovaries, which is going to be included in the revised manuscript as a new figure 5A.

      c) Use linescans on existing images to better demonstrate colocalization between Hmr and Cenp-C and/or HP1

      In the revised manuscript we will prepare linescans/profile plots for all IF pictures when necessary.

      d) Show Cenp-A and HMR staining for the images in Figure 5C and stage 10 follicle cells from Figure S7A.

      As stainings with the Cenp-C antibody resulted in more stable and reproducible signals, we used Cenp-C as a proxy for Cenp-A and centromere localization. In Figure S7A and B we stained Cenp-C and showed a greatly reduced expression in follicle cells undergoing endoreplication. We therefore did not perform a Cenp-C (or Cenp-A)/HMR co-staining in these cells and do not think it would add to a better understanding of the mechanisms of HMR localization (Figure 5C).

      e) I feel the authors do not spend enough time discussing the fact that HMRDC still appears to localize to centromeres in most follicle cells up to Stage 7.

      We now also include the staining of endogenous HMR (figure 5A revised ms) in the various cell types in ovaries. This allows us to expand the discussion of how HMR’s localization depends on cell type and stage. These studies not only reveal the high diversity of HMR localization but also suggest that the potential of HMR to localize to the centromere as well as pericentromeric heterochromatin is crucial for its function. In the revised manuscript we now discuss more extensively the fact that HMRdC still localizes to the centromere up to stage 7.

      In sum, it would also be nice for the authors to take a clear position on whether HMR is centromeric, heterochromatic or both in the cells they analyze by microscopy and why these localizations may change between the cells they have looked at.

      The fact that we now include a novel figure where we investigate HMR’s localization in different cell types allows us to discuss the (diverse) localization as well as its potential regulation more extensively. As the localization is highly dependent on the cell type observed as well as the cell cycle stage, we feel that these aspects need to be taken into account when describing HMR’s localization. This is now discussed in the revised manuscript.

      4) HMR2 analyses - I think HMR2 is an important mutant to include as a control for HMRDC, especially since the authors should already have the required strains/data. I specifically mean the following,

      a) Figure 4C - Please add HMR2 ChIP-seq tracks only if the authors already have this data.

      Unfortunately, we were unable to acquire convincing HMR2 ChIP data. This may be due to the fact that HMR2 localizes quite diffusely or due to a lower percentage of cells expressing this allele in the S2 lines used. Neither issue influences our interpretations in AP-MS experiments or in single-cell-based fluorescence microscopy assays, but both are problematic in bulk cell population assays like ChIP. Therefore, we cannot provide good HMR2 ChIP-Seq tracks.

      b) Figure 5C and Figure S7B - Add HMR2 IF images. Please also discuss HMR2 localization to centromeres and heterochromatin.

      In the revised manuscript, we have included (or will include) IF images of ovarian tissue from strains heterozygous for the Hmr2 allele. Due to the lower gene dosage, the intensity of HMR staining is reduced, making a precise localization more difficult. As the manuscript mainly focusses on the description of the newly discovered HmrdC allele, we have added this as supplemental material.

      c) Figure 5E - Increase n's for the HMR2 fertility assay.

      The HMR2 allele has been extensively characterized by Aruna and colleagues (Aruna et al., Genetics (2009)) with regards to its effect on fertility. For this particular assay we only use it as a positive control and reference for the newly described HMRdC allele. We therefore feel that an increase in the number of replicates would be redundant to the earlier publications.

      5) HMR localization in female germline cells - Given that the authors indicate that female fertility and telomeric transposon suppression are compromised with HMR2 and HMRDC, I think it would strengthen the manuscript to address HMR localization with respect to heterochromatin and centromeres in the nurse cells and/or oocytes.

      We now also include the staining of endogenous HMR (figure 5A revised ms) in nurse cells, oocytes and early-stage follicle cells. This allows us to expand the discussion of HMR’s localization in dependency of the cell type and stage.

      6) I find the last part of the abstract and discussion i.e. HMR bridges heterochromatin and the centromere, to be very speculative based on the data presented. As far as I can tell, the only experimental basis for this conclusion is the fact that HMR binds known centromeric and heterochromatic proteins. With this logic, you could easily make a similar argument for the numerous proteins that colocalize with centromeric and pericentromeric heterochromatin. Personally, I would not speculate extensively on a HMR bridging activity without more compelling functional readouts.

      Our hypothesis of HMR as a bridging factor between centromeric and pericentromeric heterochromatin is based not only on its colocalization and interaction with components of both chromatin types but also on our previous findings that an HMR knockdown results in a moderate centromere declustering and on studies using super-resolution microscopy, which indicate that HMR is sandwiched between the two components (Kochanova, N. Y. et al. (2020)). As the proteomic analysis of the two HMR alleles presented in this study suggests that interactions with both components are required for full functionality of HMR, we assume that it bridges between the two chromatin components. However, we agree with the reviewer that this could also be explained by a centromeric as well as a heterochromatic function of HMR, which are independent of each other. We therefore removed the hypothesis from the abstract and discussed it together with other potential explanations for our findings.

      **Minor comments:**

      1) Intersection plot - I would explain the intersection plot on Figure 1C more thoroughly (I found it confusing).

      We expanded the paragraph in which we explain the intersection plot in figure 1C.

      2) Image colours - The images in Figure S2 and Figure S7 are hard to interpret due to the colours used for the HA and Hmr channel respectively. I would use the white pseudo-colour for DAPI and omit this channel from the merged image and insets (a line demarcating the nucleus would suffice in the merged image). In addition, a linescan would better represent colocalizations or lack thereof.

      In the revised manuscript, we will omit the DAPI channel from the merged images and use a line to demarcate the nucleus, as suggested by the reviewer. To better illustrate co-localisation of distinct factors we will use line profile plots.

      3) I'm not convinced that one can determine stoichiometry and sub-stoichiometry of protein complexes based on spectral counts; spectral counts could be affected by other factors. Therefore, I would hesitate to use "However, HP1a is only present in sub-stoichiometric amounts in the AP-MS purifications with antibodies against the SCC...."

      The question of whether the stoichiometry of complexes can be determined using iBAQ values of purified protein complexes is intensely discussed in the field. Several studies do suggest that this can indeed be done (e.g. Wohlgemuth, I. et al. Proteomics 15, 862–879 (2015); Smits, A. H., Nucleic Acids Research 41, e28–e28 (2012)), which is why we commented on the lower intensity of HP1a relative to the other subunits of the complex. However, we agree with the reviewer that this can only be an approximation rather than a precise measurement (which would need a full in vitro reconstitution, see comments above). We have mentioned this in the revised manuscript.

      4) Ambiguity in description of methods - In the methods section 'Crosses for generating Hmr genotypes for hybrid viability assays', the authors state that "In the rescue experiment, Hmr+ served as a positive (lethality rescue) and Hmr2 as a negative control (no lethality rescue)". The authors might consider rewording this as I think it's a bit strange to refer to hybrid male lethality as a rescued state.

      We agree with the reviewer that the wording to describe the assay we used to investigate HMR’s function in male hybrids is counterintuitive as a “rescue of functionality” results in male hybrid lethality. To better describe it we now call the assay “hybrid viability suppression”, according to the nomenclature that has been used by Aruna et al, 2009 (Aruna, S. et al. Genetics (2009)).

      Reviewer #1 (Significance (Required)):

      **Nature and Significance of the advance:**

      This work adds to the study of reproductive isolation in Drosophila by defining a stable set of molecular interactors of the HMR hybrid incompatibility protein. In my opinion, this study offers a platform for future research into the poorly understood molecular events that trigger hybrid incompatibility in Drosophila. In addition, the authors generate a novel HMR mutation (HMRDC) that also rescues hybrid male lethality and it would be interesting to determine in finer detail how closely this mutation mimics other known HMR mutations. A characterization of BOH1/BOH2 would have also significantly strengthened the manuscript.

      We would like to thank the reviewer for the appreciation of our work. We agree with the reviewer that a deeper characterization of BOH1/BOH2 will further unravel their role in the complex. However, our initial experiments using null alleles or knockdowns of BOH1 and BOH2 in D.mel showed no effect or only minor effects on transposon activation and hybrid male lethality. This is most probably due to the fact that the D.sim alleles can fully complement their function. Moreover, the recombinant expression of BOH1 and 2 turned out to be difficult due to problems in protein solubility. We therefore need to postpone our BOH1 and 2 studies to a later timepoint.

      **My Expertise:**

      Satellite DNA repeats, Chromocenters, Speciation, Hybrid Incompatibility

      **Referees cross-commenting**

      I also agree that all the reviewer comments are reasonable. The manuscript would be significantly improved by making conclusions that can be supported by the data. I think some additional experiments are also warranted to make the paper more robust.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this study, the authors identify a protein complex that contains the products of the hybrid incompatibility genes Hmr and Lhr, naming it SCC (Speciation Core Complex). This paper's major conclusions are: 1) overexpression of Hmr (which resembles the situation in hybrids, where hmr/lhr are overexpressed) results in ectopic protein-protein interaction. 2) Hmr's DNA binding domain (mutated in Hmr2) and C-terminal domain (known to interact with Lhr) are important for its function and in causing hybrid lethality.

      The identification of the SCC complex is quite intriguing, but this paper does not cover much of the functional significance of this complex at all. For example, does mutating other components of the SCC complex (BOH1 etc) rescue hybrid lethality? Without examining these important issues, they instead drifted to study the domain function of Hmr. It is not so clear why these two lines of study are glued together in one paper.

      It is not that I insist that the authors have to do all these experiments, but the assembly of the paper makes this paper quite inconclusive. After reading it, the readers are left wondering what the function of SCC is---and we do not even know whether 'speciation core complex' is a fair name, without any knowledge of whether any of the components are involved in speciation or not.

      Overall, this work contains a lot of important information, which promises future breakthroughs on the subject matter. However, unfortunately, the study is not carried out to generate any conclusion and is fairly incomplete at this point.

      We thank the reviewer for his appreciation of the importance of our work and apologize that we did not clarify the reasoning of the experiments sufficiently. We think that part of the reviewer’s disappointment is due to the fact that we named the complex speciation core complex (SCC), which was indeed an unfortunate decision as we are unable to investigate the complex in male hybrids where it exerts its function in mediating hybrid incompatibility (see also answer to comments of reviewer 1). We therefore changed the name to HMR complex and tried to better explain the rationale of our experiments in the text.

      **Specific comments.**

      • The quality of Fig4A is too low. I cannot even tell where the boundary of the nucleus is. Diffuse signal in category 'yellow' and 'grey'---are they entire cell or nucleus or nucleolus? Please add additional marker(s) for better interpretation of the Hmr signal presented.

      We have improved the quality of figure 4A by adding lines to indicate the nuclear boundary and inserting profile plots to better illustrate the different types of co-localisation.

      • In Fig4A and 5C, the localization of Hmr (wild type version) looks quite different in these two images. Which image is more 'representative' for Hmr localization? (as they build the logic on Hmr localization, this inconsistency is quite bothersome). This might be a cell-type-specific issue, but if so, how do we know the relevance of their localization? These issues make the result of localization analysis of wt/mutant Hmr inconclusive.

      After reading the reviewers' responses we realized that we did not describe our findings well enough, which resulted in major confusion about the localization of HMR in cells. Indeed, the localization of HMR differs widely depending on the cell type used. We have now included a new figure (new Figure 5A) illustrating the analysis of the endogenous HMR localization in ovaries isolated from D.mel. We hope that the additional figure together with our interpretation helps to alleviate the confusion and adds to the understanding of HMR’s function and potential evolution of HMR.

      Reviewer #2 (Significance (Required)):

      Hmr and Lhr are known as 'hybrid incompatibility genes', deletion of which rescues male hybrid lethality in Drosophila melanogaster/simulans hybrid crosses. Understanding the molecular function of Hmr and Lhr is expected to provide insights into the fundamental question of how two species become incompatible (i.e. how speciation occurs). This study investigates the protein complex that contains Lhr and Hmr, identifying a previously unidentified 'core' complex. Understanding the function of this complex may significantly advance our understanding of speciation.

      **Referees cross-commenting**

      I think all review comments are reasonable. However, I'd like to emphasize that the biggest issue with this paper is not about the data, but how the authors frame it. The term such as 'speciation core complex' is beyond 'hype' (not even 'exaggeration'). Simply there is no evidence that this term can be supported. I think the authors need to be more ethical. I would be surprised if authors truly believe they can claim that the term 'speciation core complex' is justifiable in science.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      The manuscript "The integrity of the speciation core complex is necessary for centromeric binding and reproductive isolation in Drosophila" by Lukacs and colleagues describes a study that shows, by mass-spec and ChIP-seq, that two well established hybrid incompatibility proteins form a 6-protein complex that predominantly localizes near HP1a bound chromatin boundaries. With a C-terminal domain of HMR deleted, the 6-protein core complex was not disrupted, but its interaction and subsequent localization to HP1a domain near centromeres was lost. In addition, an HMR double mutant that disrupts the interaction between HMR and other components of the 6-protein core complex was tested and similar distribution patterns as for the dC mutant were observed. Next, the nuclear localization of HMR was tested in fruit fly follicle cells by IF. In endoreplicating cells, HMR-dC did not colocalize with HP1a, as did the double mutant. The expression level of several transposable elements (TEs) was assessed and only the full length wt Hmr transgene was able to rescue the repression of TEs, whereas neither the dC nor the double mutant did. When the number of offspring was assayed, a similar pattern was observed. Finally, male hybrid lethality was assayed by crossing D melanogaster mothers with different Hmr alleles with wt D simulans and only the wt Hmr allele resulted in male lethality, whereas both dC and double mutants resulted in 10-40% of the offspring being male. These findings led the authors to conclude that 1) there is a 6-protein speciation core complex containing HMR, LHR, NLP, NPH, and two uncharacterized proteins called BOH1 and BOH2, 2) overexpression of HMR/LHR results in novel interactions with other chromatin factors, 3) both the double mutant (E317K and G527A) and the C-terminal deletion mutant are important for protein-protein interaction within the 6-protein complex and associated factors such as HP1a, and 4) HMR bridges heterochromatin and centromeres.

      **Major comments:**

      • Most of the key conclusions are supported by the evidence presented in this manuscript. The link between centromeres and HMR (and presumably the rest of the 6-protein complex) hinges only on colocalization IF and ChIP-seq data. The change in Hmr localization in cycling follicle vs endoreplicating cells of especially the dC mutant is very interesting. The loss of CENP-C signal correlates with a change in Hmr^dC signal. What exactly drives this change is not explored.

      We have shown in the past that HMR requires full length Cenp-C to localize to the centromere in S2 cells. We assume that this is also the case in the follicle cells. Therefore, the lack of Cenp-C recruitment in endoreplicating cells is likely the reason why HMR localizes primarily to HP1a-containing heterochromatin. Unlike wild-type HMR, HMRdC cannot bind LHR/HP1a, as our AP-MS data show, and is therefore not recruited to heterochromatin and diffuses away in later stages. We have described this point more extensively in the revised manuscript.

      • The data presented in this manuscript are mostly clear (see minor comments) and appear to be reproducible, especially as the methods section is detailed and both the ChIP-seq and mass-spec data are deposited in publicly accessible databases.
      • The rationale for overexpressing both HMR and LHR in cell lines is not clearly explained.

      As outlined in our response to reviewer 1 the overexpression of HMR and LHR was designed to simulate the hybrid situation, which shows an increase in HMR and LHR levels (Thomae, A. W. et al. Developmental Cell 27, 412–424 (2013)). We have indicated this in the revised manuscript.

      • The HMR/LHR overexpression experiment is very nice, and as one would expect, resulted in more protein interactions. Some of these might simply be the result of the abundance of HMR and LHR, which have saturated the core 6-protein complex. This leaves the question what is the true minimal size of the HMR/LHR complex? The dC mutant that removes the BESS domain as well as the double point mutations that disrupt the complex altogether, get to the importance of the stability of the complex and its association with especially HP1a. What the minimal interacting partners of HMR and LHR are could be explored by knocking down both factors and performing mass-spec.

      We agree with the reviewer that the abundance of HMR and LHR results in a saturation of the core complex, thereby having a spillover effect on other proteins. In this regard it is worth mentioning that the expression of the Hmr2 allele does not completely disrupt the complex but rather results in a loss of interactions with NLP, NPH, BOH1 and BOH2 while maintaining the interaction with LHR and HP1a. In fact, when the HMR2 protein is expressed, it shows a stronger interaction with known heterochromatic proteins than the wt protein (Figure 3B). As both mutant alleles show functional defects in pure species and in male hybrids we assume that HMR and LHR need to bind both chromatin types simultaneously. We consider the complex to be somewhat modular as we show that HMR and LHR can interact in isolation (Figure S4) while others have shown that LHR and HP1a, as well as NLP and NPH, interact (Greil, F. et al. EMBO J (2007); Anselm, E. et al. Nucleic Acids Research (2018), respectively). This is now pointed out in the revised manuscript.

      • For the telomeric TE expression as well as offspring count shown in Figure 5D,E, a wild-type control would be informative as a measure of how well the Hmr+/+ rescues both phenotypes.

      The misregulation of transposable elements (TE) and fertility defects of Hmr loss of function mutants have been previously characterized (Satyaki, P. R. V. et al. PLoS genetics (2014); Aruna et al., Genetics (2009)). We therefore rather focused on the relative expression of TEs in the HmrdC and Hmr2 mutants relative to the wild type rescue allele (Hmr+). Hmr2 serves as a known non-rescue allele (Aruna et al., 2009) in the fertility experiment, while in the TE experiment we describe for the first time a defect in TE repression for this allele.

      **Minor comments:**

      • In the opening paragraph of the introduction, the authors describe a scenario of sympatric speciation, which is subsequently highlighted by the speciation event between D. melanogaster and D. simulans. Yet, these two species have similar but not identical distribution ranges, leaving open the possibility the speciation event happened in parapatry. It might be worth rephrasing the first paragraph to leave open both modes of speciation, especially as the manuscript focuses on the mechanistic side of hybrid incompatibility-associated proteins.

      We did not want to imply that our experiments allow a distinction between sympatric and parapatric speciation. We thank the reviewer for pointing this out and rephrased the first paragraph accordingly.

      • Some of the abbreviations are repeated (e.g. SCC) others aren't introduced (e.g. HI). Overall, fewer abbreviations will make the text more readable, especially for non-experts.

      We tried to avoid acronyms wherever possible and got rid of the term SCC altogether. All acronyms are introduced at the first appearance.

      • The IF signal in Figure 4A is difficult to see on the black background. I would suggest either increasing the gain to improve the visibility of the signal or showing it in black-and-white. In addition, the colors should be labeled in the figure for clarity.

      We improved the quality of Figure 4A and labeled the different types of localization (see also answer to reviewer 1).

      • In Figure 5C the images for the Hmr^KO;Hmr^2 appear to be missing.

      See answer to reviewer 1 (4b). We have included (or will include) the corresponding picture as supplementary material as we consider the characterization of the novel Hmr allele to be the main focus of the manuscript.

      In addition, for non-experts it might be helpful to mention which set of IF images are controls, rescues, and test, similar to what was done in Figure 5B.

      We have indicated (or will indicate) which IF pictures are controls and rescue experiments.

      Reviewer #3 (Significance (Required)):

      **Significance:**

      • This study provides novel insight into two factors involved in male hybrid lethality, the chromatin factors with which they are associated, and how two mutants impact chromatin localization and in vivo phenotypes.
      • Understanding the molecular basis of speciation is limited as most factors that drive speciation are not identified. Drosophila species are at the forefront of this research. Post-zygotic factors have predominantly been found to have strong speciation potential. This work builds very nicely on this work.
      • This manuscript will be predominantly interesting for the Drosophila chromatin field and speciation field.
      • I am trained in comparative genomics focusing on centromeric repeats and now study chromatin dynamics at the single molecule level, using cell biology, biochemical and biophysical tools.

      We thank the reviewer for appreciating our work. We think that our work will also be interesting for researchers focusing on centromere clustering and genome organization in general and independently of the Drosophila system.

      **Referees cross-commenting**

      Reviewer comments look reasonable to me. A 1-3 month revision is not an undue burden; I think they can do at least some of what was requested. In response to Rev2: Agreed, they ought to tone it down.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript "The integrity of the speciation core complex is necessary for centromeric binding and reproductive isolation in Drosophila" by Lukacs and colleagues describes a study that shows, by mass-spec and ChIP-seq, that two well established hybrid incompatibility proteins form a 6-protein complex that predominantly localizes near HP1a bound chromatin boundaries. With a C-terminal domain of HMR deleted, the 6-protein core complex was not disrupted, but its interaction and subsequent localization to HP1a domain near centromeres was lost. In addition, an HMR double mutant that disrupts the interaction between HMR and other components of the 6-protein core complex was tested and similar distribution patterns as for the dC mutant were observed. Next, the nuclear localization of HMR was tested in fruit fly follicle cells by IF. In endoreplicating cells, HMR-dC did not colocalize with HP1a, as did the double mutant. The expression level of several transposable elements (TEs) was assessed and only the full length wt Hmr transgene was able to rescue the repression of TEs, whereas neither the dC nor the double mutant did. When the number of offspring was assayed, a similar pattern was observed. Finally, male hybrid lethality was assayed by crossing D melanogaster mothers with different Hmr alleles with wt D simulans and only the wt Hmr allele resulted in male lethality, whereas both dC and double mutants resulted in 10-40% of the offspring being male. These findings led the authors to conclude that 1) there is a 6-protein speciation core complex containing HMR, LHR, NLP, NPH, and two uncharacterized proteins called BOH1 and BOH2, 2) overexpression of HMR/LHR results in novel interactions with other chromatin factors, 3) both the double mutant (E317K and G527A) and the C-terminal deletion mutant are important for protein-protein interaction within the 6-protein complex and associated factors such as HP1a, and 4) HMR bridges heterochromatin and centromeres.

      Major comments:

      • Most of the key conclusions are supported by the evidence presented in this manuscript. The link between centromeres and HMR (and presumably the rest of the 6-protein complex) hinges only on colocalization IF and ChIP-seq data. The change in Hmr localization in cycling follicle vs endoreplicating cells of especially the dC mutant is very interesting. The loss of CENP-C signal correlates with a change in Hmr^dC signal. What exactly drives this change is not explored.
      • The data presented in this manuscript are mostly clear (see minor comments) and appear to be reproducible, especially as the methods section is detailed and both the ChIP-seq and mass-spec data are deposited in publicly accessible databases.
      • The rationale for overexpressing both HMR and LHR in cell lines is not clearly explained.
      • The HMR/LHR overexpression experiment is very nice, and as one would expect, resulted in more protein interactions. Some of these might simply be the result of the abundance of HMR and LHR, which have saturated the core 6-protein complex. This leaves the question: what is the true minimal size of the HMR/LHR complex? The dC mutant that removes the BESS domain, as well as the double point mutations that disrupt the complex altogether, get at the importance of the stability of the complex and its association with HP1a in particular. What the minimal interacting partners of HMR and LHR are could be explored by knocking down both factors and performing mass-spec.
      • For the telomeric TE expression as well as offspring count shown in Figure 5D,E, a wild-type control would be informative as a measure how well the Hmr+/+ rescues both phenotypes.

      Minor comments:

      • In the opening paragraph of the introduction, the authors describe a scenario of sympatric speciation, which is subsequently highlighted by the speciation event between D. melanogaster and D. simulans. Yet, these two species have similar but not identical distribution ranges, leaving open the possibility that the speciation event happened in parapatry. It might be worth rephrasing the first paragraph to leave open both modes of speciation, especially as the manuscript focuses on the mechanistic side of hybrid incompatibility-associated proteins.
      • Some of the abbreviations are repeated (e.g. SCC), others aren't introduced (e.g. HI). Overall, fewer abbreviations will make the text more readable, especially for non-experts.
      • The IF signal in Figure 4A is difficult to see on the black background. I would suggest either increasing the gain to improve the visibility of the signal or showing it in black-and-white. In addition, the colors should be labeled in the figure for clarity.
      • In Figure 5C the images for the Hmr^KO;Hmr^2 appear to be missing. In addition, for non-experts it might be helpful to mention which sets of IF images are controls, rescues, and tests, similar to what was done in Figure 5B.

      Significance

      Significance:

      • This study provides novel insight into two factors involved in male hybrid lethality, the chromatin factors with which they are associated, and how two mutants impact chromatin localization and in vivo phenotypes.
      • Understanding of the molecular basis of speciation is limited, as most factors that drive speciation have not been identified. Drosophila species are at the forefront of this research. Post-zygotic factors have predominantly been found to have strong speciation potential. This study builds very nicely on that work.
      • This manuscript will be predominantly interesting for the Drosophila chromatin field and speciation field.
      • I am trained in comparative genomics focusing on centromeric repeats and now study chromatin dynamics at the single-molecule level, using cell biology, biochemical and biophysical tools.

      **Referees cross-commenting**

      Reviewer comments look reasonable to me. A 1-3 month revision is not an undue burden; I think they can do at least some of what was requested. In response to Rev2: agreed, they ought to tone it down.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study, the authors identify a protein complex that contains the hybrid incompatibility proteins Hmr and Lhr, naming it SCC (Speciation Core Complex). This paper's major conclusions are: 1) overexpression of Hmr (which resembles the situation in hybrids, where Hmr/Lhr are overexpressed) results in ectopic protein-protein interactions. 2) Hmr's DNA binding domain (mutated in Hmr2) and C-terminal domain (known to interact with Lhr) are important for its function and in causing hybrid lethality.

      The identification of the SCC is quite intriguing, but this paper does not cover much of the functional significance of this complex at all. For example, does mutating other components of the SCC (BOH1 etc.) rescue hybrid lethality? Without examining these important issues, the authors instead drifted to studying the domain function of Hmr. It is not so clear why these two lines of study are glued together in one paper.

      It is not that I insist that the authors have to do all these experiments, but the assembly of the paper makes it quite inconclusive. After reading it, readers are left wondering what the function of the SCC is, and we do not even know whether 'speciation core complex' is a fair name, without any knowledge of whether any of the components is involved in speciation or not.

      Overall, this work contains a lot of important information, which promises future breakthrough on the subject matter. However, unfortunately, the study is not carried out to generate any conclusion and is fairly incomplete at this point.

      Specific comments.

      • The quality of Fig4A is too low. I cannot even tell where the boundary of the nucleus is. For the diffuse signal in categories 'yellow' and 'grey': does it cover the entire cell, the nucleus, or the nucleolus? Please add additional marker(s) for better interpretation of the Hmr signal presented.
      • In Fig4A and 5C, the localization of Hmr (wild type version) looks quite different in these two images. Which image is more 'representative' of Hmr localization? (As they build the logic on Hmr localization, this inconsistency is quite bothersome.) This might be a cell-type-specific issue, but if so, how do we know the relevance of their localization? These issues make the result of the localization analysis of wt/mutant Hmr inconclusive.

      Significance

      Hmr and Lhr are known as 'hybrid incompatibility genes', deletion of which rescues male hybrid lethality in Drosophila melanogaster/simulans hybrid crosses. Understanding the molecular function of Hmr and Lhr is expected to provide insights into the fundamental question of how two species become incompatible (i.e. how speciation occurs). This study investigates the protein complex that contains Lhr and Hmr, identifying a previously unidentified 'core' complex. Understanding the function of this complex may significantly advance our understanding of speciation.

      **Referees cross-commenting**

      I think all review comments are reasonable. However, I'd like to emphasize that the biggest issue with this paper is not about the data, but how the authors frame it. A term such as 'speciation core complex' is beyond 'hype' (not even 'exaggeration'). There is simply no evidence to support this term. I think the authors need to be more ethical. I would be surprised if the authors truly believe they can claim that the term 'speciation core complex' is justifiable in science.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      How genes involved in hybrid incompatibility function within and across species remains incompletely characterized. This manuscript identifies two novel proteins (BOH1, BOH2) as well as three known proteins (LHR, NLP, NPH) as strong and reproducible interactors of the HMR hybrid incompatibility gene using AP-LC-MS in Drosophila S2 cells and labels these proteins as a 'speciation core complex' (SCC). The authors further show that HMR mutations (the previously identified HMR2 and a newly generated C-terminal truncation lacking the HMR BESS motif, HMRC) differentially disrupt these interactions and alter centromeric HMR localization in S2 cells and tissues. Much like previously described HMR mutations (e.g. HMR2), HMRC rescues HMR-mediated hybrid male lethality in D. melanogaster-D. simulans hybrids, leading the authors to conclude that the integrity of the SCC is necessary for centromeric binding and reproductive isolation.

      Major comments:

      I think the experiments within the manuscript are generally of good quality and well controlled. However, I find that the authors' conclusions are very often not supported by the experiments performed (as detailed below) and I would strongly recommend that the authors stick to the conclusions that can be drawn based on the data they have generated. In my opinion, this manuscript contains findings that are of interest to the field but it needs to be rewritten with more justifiable conclusions.

      1) 'Speciation Core Complex' - The only link to speciation is the fact that the 'SCC' includes D. melanogaster HMR, a known hybrid incompatibility gene. On the other hand, all of these proteins have important functions in a pure species context and all of the interactions reported between the members of the SCC occur in a D. melanogaster background. Also, SCC assembly in viable/inviable hybrids is not tested. Essentially, I would come up with a different and more functionally consistent name for the complex. I highly recommend against naming these stable interactors the 'SCC' unless the authors can show that mutating any of the other 'SCC' proteins (specifically NLP, NPH, BOH1 & BOH2), which should presumably also disrupt SCC formation, leads to the rescue of hybrid male lethality.

      2) Is it a stable 6-membered complex? - The only line of evidence for the presence of a stable complex between all 6 proteins is the MS data from Figure 1C and Figure S1A-C. Although I don't think it is necessarily required, a biochemical demonstration that these proteins co-sediment at a high MW would be a much stronger indication of complex formation. That being said, I think the authors can use their expertise in AP-LC/MS to more comprehensively characterize complex formation.

      a) For example, the authors could test whether loss of BOH1/BOH2 in S2 cells impacts complex formation. A reduction of interactions between other complex members would strengthen the authors' conclusion of a stable and stoichiometric 6-membered complex.

      b) Additionally, I would suggest that they use one (or more) of BOH1/BOH2/NLP/LHR as baits in the S2 cells expressing HMR mutations (HMR2 and HMRC, Figure 3) to test complex formation. Beyond Figs. 1 and S1, the authors only test one-way interactions between HMR (or HMR mutants) and the other 5 binding partners. It is unclear if the other 5 'SCC' members are capable of binding each other when HMR is mutated. As a result, how HMR affects the ability of other proteins to interact with each other and its role in complex formation remains somewhat unclear. This is particularly important since the authors conclude in the discussion that "HMR acts as a molecular bridge between different modules of the SCC" and that "the integrity of the SCC is essential for its function".

      3) Centromeric vs heterochromatic localization of HMR - There appear to be some differences in Hmr localization across different tissues, as the authors have noted in their introduction. In this manuscript, the authors assess HMR localization in S2 cells as well as mitotic and endocycling follicle cells from various stages of oogenesis. In these cell types, the authors compare HMR localization to both Cenp-C (centromere) and HP1 (constitutive heterochromatin). In my opinion, it is not easy to get a clear perspective on what the authors consider to be HMR's true localization in these cells and tissues. I would recommend the following straightforward changes/experiments related to this point:

      a) Label the image categories in Figure 4A. Please also describe in detail the classification criteria that were used to separate these image categories from one another.

      b) I would also move Figure S7A to the main text since it demonstrates centromeric colocalization of HMR in early follicle cells.

      c) Use linescans on existing images to better demonstrate colocalization between Hmr and Cenp-C and/or HP1.

      d) Show Cenp-A and HMR staining for the images in Figure 5C and stage 10 follicle cells from Figure S7A.

      e) I feel the authors do not spend enough time discussing the fact that HMRC still appears to localize to centromeres in most follicle cells up to Stage 7.

      In sum, it would also be nice for the authors to take a clear position on whether HMR is centromeric, heterochromatic or both in the cells they analyze by microscopy and why these localizations may change between the cells they have looked at.

      4) HMR2 analyses - I think HMR2 is an important mutant to include as a control for HMRC, especially since the authors should already have the required strains/data. I specifically mean the following:

      a) Figure 4C - Please add HMR2 ChIP-seq tracks only if the authors already have this data.

      b) Figure 5C and Figure S7B - Add HMR2 IF images. Please also discuss HMR2 localization to centromeres and heterochromatin.

      c) Figure 5E - Increase n's for the HMR2 fertility assay.

      5) HMR localization in female germline cells - Given that the authors indicate that female fertility and telomeric transposon suppression are compromised with HMR2 and HMRC, I think it would strengthen the manuscript to address HMR localization with respect to heterochromatin and centromeres in the nurse cells and/or oocytes.

      6) I find the last part of the abstract and discussion, i.e. that HMR bridges heterochromatin and the centromere, to be very speculative based on the data presented. As far as I can tell, the only experimental basis for this conclusion is the fact that HMR binds known centromeric and heterochromatic proteins. With this logic, you could easily make a similar argument for the numerous proteins that colocalize with centromeric and pericentromeric heterochromatin. Personally, I would not speculate extensively on an HMR bridging activity without more compelling functional readouts.

      Minor comments:

      1) Intersection plot - I would explain the intersection plot on Figure 1C more thoroughly (I found it confusing).

      2) Image colours - The images in Figure S2 and Figure S7 are hard to interpret due to the colours used for the HA and Hmr channel respectively. I would use the white pseudo-colour for DAPI and omit this channel from the merged image and insets (a line demarcating the nucleus would suffice in the merged image). In addition, a linescan would better represent colocalizations or lack thereof.

      3) I'm not convinced that one can determine stoichiometry and sub-stoichiometry of protein complexes based on spectral counts; spectral counts could be affected by other factors. Therefore, I would hesitate to use "However, HP1a is only present in sub-stoichiometric amounts in the AP-MS purifications with antibodies against the SCC...."

      4) Ambiguity in description of methods - In the methods section 'Crosses for generating Hmr genotypes for hybrid viability assays', the authors state that "In the rescue experiment, Hmr+ served as a positive (lethality rescue) and Hmr2 as a negative control (no lethality rescue)". The authors might consider rewording this as I think it's a bit strange to refer to hybrid male lethality as a rescued state.

      Significance

      Nature and Significance of the advance:

      This work adds to the study of reproductive isolation in Drosophila by defining a stable set of molecular interactors of the HMR hybrid incompatibility protein. In my opinion, this study offers a platform for future research into the poorly understood molecular events that trigger hybrid incompatibility in Drosophila. In addition, the authors generate a novel HMR mutation (HMRC) that also rescues hybrid male lethality and it would be interesting to determine in finer detail how closely this mutation mimics other known HMR mutations. A characterization of BOH1/BOH2 would have also significantly strengthened the manuscript.

      My Expertise:

      Satellite DNA repeats, Chromocenters, Speciation, Hybrid Incompatibility

      **Referees cross-commenting**

      I also agree that all the reviewer comments are reasonable. The manuscript would be significantly improved by making conclusions that can be supported by the data. I think some additional experiments are also warranted to make the paper more robust.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their positive comments on our manuscript. To address their criticisms, we propose to do the following experiments:

      Reviewer 1 (minor comments):

      1. In Fig. 1, the authors show that Btz-WT, but not Btz-HD, localizes to the posterior pole of the oocyte. Do the authors see Btz-WT and/or Btz-HD localized to MNs/muscles/glia at the NMJ?

      We have had difficulty detecting the expression of our Btz-GFP transgenes at the NMJ. In case this was due to competition with endogenous wild-type Btz, we will repeat the staining in a btz mutant background. If the protein is still undetectable, we can include data showing the localization of UAS-Btz-GFP when overexpressed in muscles or motor neurons.

      The mitochondrial phenotypes observed in Btz mutants are striking. But it seems possible that there are defects in overall mitochondrial levels in muscle in addition to defects in their localization. Overall, mitochondrial levels seemed reduced in Btz mutants. Is it possible to do an ATP5A immunoblot in Btz mutants to test whether overall mitochondrial levels are altered?

      We will do a Western blot to compare ATP5A levels in btz2/+ and btz2/Df(3R)BSC497 larval carcasses.

      ECM proteins are known to be critical for regulating TGFB signaling. That, taken with the multi-tissue genetic requirement for Btz, suggests that Btz might directly regulate either Ltl or Frac RNA, given that these ECM proteins are likely deposited by multiple cell types.

      We agree that this is a possibility and we will mention it in the Discussion.

      Reviewer 2 (major comments):

      1. In Figure 1, regarding the validation of rescue constructs: the EJC interaction-defective mutant is based solely on conservation, as all structural/interaction studies cited with Btz bound to EJC have been with human proteins. They use Vasa localization as a readout of EJC-dependent function, but this is indirect and only assesses one aspect of EJC function (localization). Since many of the main conclusions in the paper are predicated on this mutant being EJC-independent, they should validate this with the Drosophila orthologs using immunoprecipitation. They demonstrate the capability of expressing GFP-tagged versions of Casc3 WT and mutant in S2 cells, so this should not be a cumbersome control experiment to include.

      We will express tagged Btz-WT and Btz-HD proteins in S2 cells and test whether they can be co-immunoprecipitated with Myc-tagged Drosophila eIF4AIII.

      Regarding Figure 3, it could be postulated that the number of boutons would be influenced by the length of axons. Is axon outgrowth accounted for in these experiments? This would influence number of synaptic boutons. Panel F looks very different from panel A in terms of axon length (could this be due to axon outgrowth defect and/or impacted muscle size?) Can quantitation be done also by normalizing to axon length (bouton number/axon length)? Or perhaps this is accounted for in muscle size? If so, this should be explained.

      The NMJ grows during development by adding both axonal branches and synaptic boutons, so its size can be measured by counting the number of boutons or branches or measuring branch length. These measures are usually well correlated. In this paper we used bouton number normalized to muscle surface area as our measure of NMJ size, but we did observe corresponding changes in the number and length of branches, as the reviewer points out. We will explain this more clearly in the text.

      In Figure 3 quantification: n's vary between genotypes significantly, and this should be explained (e.g. was there a recovery issue between genotypes or just fewer needed for WT-like?).

      The btz mutant larvae are more difficult to dissect due to muscle fragility, and some crosses in this genetic background may have yielded fewer usable filets than desired. We believe the numbers we obtained are sufficient to show which differences are significant.

      In Figure 4 panels B and F (mutants), there appears to be reduced axon outgrowth (see point above). This should be taken into account when expressing bouton number.

      As explained in our response to point 2, axon length and bouton number are correlated measures of synapse size and vary together in this figure as expected.

      The RNA-seq data (Figure 5) has a potential issue in that they used larvae with a balancer chromosome (Df), which yields a 50% reduction in any genes on that chromosome. They acknowledge this and removed these genes from the analysis, but the concern remains that this still might be a confounding variable (for example, if reduction in any of these genes might disrupt a signaling pathway). We do not think that the RNA-seq needs to be repeated, but we propose that the authors validate these targets using qPCR in their MN-specific btz knockdown system (this way, they can also include magoh and eif4aIII knockdowns for comparison).

      Because only one btz allele was available, we used transheterozygotes with a deficiency for the region to avoid homozygosing other mutations that might be present on the btz2 chromosome. As a consequence, we did observe reduced expression of genes located within the deficiency (which covers a small region, not an entire chromosome), and it is possible that this might contribute to the phenotype. However, we have seen a similar reduction in NMJ size in btz2 homozygotes. We do not think that motor neuron-specific btz knockdown is a useful genotype to validate the RNA-Seq results because ltl and frac levels do not change significantly in the CNS, only in muscle, and knockdown only in motor neurons would be unlikely to change daw levels measured in the whole CNS. Knocking down mago or eIF4AIII in muscle is lethal before the third larval instar stage, preventing us from comparing their effects on gene expression to those of btz. However, we will do qRT-PCR to measure daw, ltl and frac mRNA levels in btz2 homozygous mutant muscles.

      Reviewer 2 (minor comments):

      1. Some statements made in the introduction are not entirely accurate:

        "A fourth core subunit, known as Barentsz (Btz), Cancer susceptibility candidate gene 3 (CASC3), or Metastatic lymph node 51 (MLN51), associates with the complex following the completion of splicing, and is required for the effects of the EJC on translation, NMD and mRNA localization (Chazal et al., 2013; Palacios et al., 2004; Shibuya et al., 2006; van Eeden et al., 2001)."

        A recent study indicates that Casc3 is not required for EJC-dependent NMD targets in human cells, but rather enhances NMD on a subset of targets (Gerbracht et al. 2020 NAR). Perhaps "is required" should be changed to "plays a role in cytoplasmic EJC-mediated processes, such as...". It has also been shown that EJC core can assemble without Casc3 (e.g. Ballut et al 2005 NSMB, Gehring et al 2009 PLoS Biol). Previous work from the authors shows that Casc3 (Btz) is not necessary for EJC function in pre-mRNA splicing (Roignant et al, 2010 Cell). Further, there exists a population of Casc3 lacking EJCs in human cells (Mabin et al 2018 Cell Reports). Collectively, all this evidence points to Casc3 not being a core EJC subunit.

      We will change the text so that we do not refer to Btz/Casc3 as a core subunit.

        • "In the mouse brain, haploinsufficiency for Magoh, Rbm8a or Eif4a3 causes severe microcephaly, but complete loss of Casc3 has a much milder effect that can be attributed to developmental delay (Mao et al., 2017; Mao et al., 2016; Mao et al., 2015; Silver et al., 2010)."

        From Mao et al 2017: complete loss and hypomorphic mutants were embryonic and perinatally lethal (contrary to what the authors are stating here), while compound mutants and heterozygotes exhibited neurodevelopmental delay. By "milder effects" the authors could also be referring to brain size being proportional to body size in the complete loss homozygotes; either way, this should be clarified.

      By “milder effects” we meant the effect on brain size. We will clarify this in the revised text.

      Fly-specific nomenclature could be made more accessible to a broader audience, as the full readership will likely not have expertise in Drosophila genetics. For example, w118, btz2 labels used in figures are not explained anywhere in the manuscript. While the authors do a good job of describing various mutants in a more accessible fashion in the results section, the genotype labels in figures can be better explained in the legends.

      We apologize for this and will clarify the genotype labels in the figure legends.

      Fig 2 L-N panels might warrant more explanation. Can the mitochondria be counted here? Is there also a difference in volume/morphology that could be quantitated? In Figure 2N, muscle fibers are more densely packed in mutant vs. control; can this be explained?

      We are hesitant to quantify mitochondria or comment on muscle fiber packing based on the EM images, because only one individual of each genotype was examined. We prefer to simply use these images to provide a higher resolution view of the change in mitochondrial distribution that we observed and quantified using light microscopy. However, we do plan to do a Western blot to determine whether there are changes in the number of mitochondria in btz mutants (see Reviewer 1 point 2).

      In Fig 2, to draw parallels between panels A-K and L-N, it might also be helpful to use the red/yellow arrow system on panel A for comparison.

      This is a good suggestion that we will follow.

      In Figure 3, it might be helpful for a general audience to include zoomed-in picture of boutons (as in Fig 5B), as some panels appear to have less defined bouton shape.

      We do observe that boutons tend to be less well separated from each other in btz mutants, and will include zoomed-in pictures to document this.

      Is the bouton size different in the mutant in Figure 3? Can this be quantified?

      We do not think that there is a significant difference in bouton size in btz mutants, but we will measure this and include a quantification.

      Fold changes are modest and not very apparent in staining (we acknowledge that this could be due to early developmental time point). Images could better point out differences in WT vs. mutant that are not readily apparent to those outside the fly neurodevelopment audience.

      Because of the inherent variability in synapse shape, it can be difficult to appreciate changes in bouton number from a single image. However, our quantifications show that the changes are consistent and significant.

      Fig 4 NMJs are shown on a different scale (more zoomed in) than in Figure 3, and differences are a bit easier to see at this scale. Presenting Fig 3 on this scale might help the reader with visualizing the differences in WT versus mutant.

      We will crop the images in Figure 3 so as to show them at the same scale as in Figure 4.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Ho et al. describe the developmental functions of the Drosophila Casc3 ortholog, Barentsz (Btz), using in vivo loss-of-function and rescue experiments in Drosophila larvae. In this study, the authors find that loss of Casc3 contributes to neuromuscular defects in the larval fly. Utilizing transgenics of WT and EJC interaction-defective mutants, they demonstrate that Btz has both EJC-dependent and EJC-independent functions in the larval neuromuscular junction, wherein muscle defects are EJC-dependent and synaptic defects are EJC-independent. Using RNA-seq, they find that upregulated mRNAs include those that belong to the Activin signaling pathway. They go on to find that the neuromuscular defects in Btz mutants can be attributed to dysregulation of Activin signaling, and are rescued with loss of the Activin ligand, Dawdle (Daw).

      Major Comments

      Overall, the paper presents well-controlled experiments that support the main conclusions. We propose achievable validation experiments that we believe will strengthen the conclusions of the paper. There is some concern that the magnitude of the effects is overstated, or could be made more apparent to a broader audience (i.e. those in the mRNA regulation field beyond Drosophila geneticists).

      • In Figure 1, regarding the validation of rescue constructs: the EJC interaction-defective mutant is based solely on conservation, as all structural/interaction studies cited with Btz bound to EJC have been with human proteins. They use Vasa localization as a readout of EJC-dependent function, but this is indirect and only assesses one aspect of EJC function (localization). Since many of the main conclusions in the paper are predicated on this mutant being EJC-independent, they should validate this with the Drosophila orthologs using immunoprecipitation. They demonstrate the capability of expressing GFP-tagged versions of Casc3 WT and mutant in S2 cells, so this should not be a cumbersome control experiment to include.

      • Regarding Figure 3, it could be postulated that the number of boutons would be influenced by the length of axons. Is axon outgrowth accounted for in these experiments? This would influence number of synaptic boutons. Panel F looks very different from panel A in terms of axon length (could this be due to axon outgrowth defect and/or impacted muscle size?) Can quantitation be done also by normalizing to axon length (bouton number/axon length)? Or perhaps this is accounted for in muscle size? If so, this should be explained.

      • In Figure 3 quantification: n's vary between genotypes significantly, and this should be explained (e.g. was there a recovery issue between genotypes or just fewer needed for WT-like?).

      • In Figure 4 panels B and F (mutants), there appears to be reduced axon outgrowth (see point above). This should be taken into account when expressing bouton number.

      • The RNA-seq data (Figure 5) has a potential issue in that they used larvae with a balancer chromosome (Df), which yields a 50% reduction in any genes on that chromosome. They acknowledge this and removed these genes from the analysis, but the concern remains that this still might be a confounding variable (for example, if reduction in any of these genes might disrupt a signaling pathway). We do not think that the RNA-seq needs to be repeated, but we propose that the authors validate these targets using qPCR in their MN-specific btz knockdown system (this way, they can also include magoh and eif4aIII knockdowns for comparison).

      Minor comments

      Some statements made in the introduction that are not entirely accurate:

      • "A fourth core subunit, known as Barentsz (Btz), Cancer susceptibility candidate gene 3 (CASC3), or Metastatic lymph node 51 (MLN51), associates with the complex following the completion of splicing, and is required for the effects of the EJC on translation, NMD and mRNA localization (Chazal et al., 2013; Palacios et al., 2004; Shibuya et al., 2006; van Eeden et al., 2001)."

      A recent study indicates that Casc3 is not required for EJC-dependent NMD targets in human cells, but rather enhances NMD on a subset of targets (Gerbracht et al. 2020 NAR). Perhaps "is required" should be changed to "plays a role in cytoplasmic EJC-mediated processes, such as...". It has also been shown that the EJC core can assemble without Casc3 (e.g. Ballut et al 2005 NSMB, Gehring et al 2009 PLoS Biol). Previous work from the authors shows that Casc3 (Btz) is not necessary for EJC function in pre-mRNA splicing (Roignant et al, 2010 Cell). Further, there exists a population of Casc3-lacking EJCs in human cells (Mabin et al 2018 Cell Reports). Collectively, all this evidence points to Casc3 not being a core EJC subunit.

      • "In the mouse brain, haploinsufficiency for Magoh, Rbm8a or Eif4a3 causes severe microcephaly, but complete loss of Casc3 has a much milder effect that can be attributed to developmental delay (Mao et al., 2017; Mao et al., 2016; Mao et al., 2015; Silver et al., 2010)."

      From Mao et al 2017: complete loss and hypomorphic mutants were embryonic and perinatally lethal (contrary to what the authors are stating here), while compound mutants and heterozygotes exhibited neurodevelopmental delay. By "milder effects" the authors could also be referring to brain size being proportional to body size in the complete loss homozygotes; either way, this should be clarified.

      General minor comments:

      • Fly-specific nomenclature could be made more accessible to a broader audience, as the full readership will likely not have expertise in Drosophila genetics. For example, w118, btz2 labels used in figures are not explained anywhere in the manuscript. While the authors do a good job of describing various mutants in a more accessible fashion in the results section, the genotype labels in figures can be better explained in the legends.

      • Fig 2 L-N panels might warrant more explanation. Can the mitochondria be counted here? Is there also a difference in volume/morphology that could be quantitated? In Figure 2N, muscle fibers are more densely packed in mutant vs. control; can this be explained?

      • In Fig 2, to draw parallels between panels A-K and L-N, it might also be helpful to use the red/yellow arrow system on panel A for comparison.

      • In Figure 3, it might be helpful for a general audience to include a zoomed-in picture of boutons (as in Fig 5B), as some panels appear to have a less defined bouton shape.

      • Is the bouton size different in the mutant in Figure 3? Can this be quantified?

      • Fold changes are modest and not very apparent in staining (we acknowledge that this could be due to early developmental time point). Images could better point out differences in WT vs. mutant that are not readily apparent to those outside the fly neurodevelopment audience.

      • Fig 4 NMJs are shown on a different scale (more zoomed in) than in Figure 3, and differences are a bit easier to see at this scale. Presenting Fig 3 on this scale might help the reader with visualizing the differences in WT versus mutant.

      Significance

      Overall, this paper contributes conceptually to understanding EJC-mediated mRNA regulation during development. The contribution here is incremental, but meaningful in terms of defining the scope of regulation by the EJC and its peripheral factors in various contexts. These findings will likely be of interest to the fields of RNA metabolism and neurodevelopment. It also adds to the existing work suggesting Casc3 may have additional functions outside of the EJC (e.g. Mao et al. 2017 RNA, Baguet et al 2007 J Cell Sci, Cougot et al. 2014 J Cell Sci); while these previous studies have suggested Casc3 roles in development and mRNA localization/granule formation that are different from the EJC core proteins, this study more directly tests an EJC-independent role in mRNA regulation of specific targets. Further addressing the molecular basis of this regulation will be outside the scope of this article but will be of interest to the field.

      We are molecular biologists who study NMD and are thus equipped to address the EJC-related molecular functions and impact on the transcriptome. We do not have expertise in Drosophila genetics or neurobiology, and thus cannot critically evaluate the specific genetic approaches used or anatomy presented to the full extent. We have, however, pointed out areas that need elaboration regarding the genetic approaches and/or presentation of data that may be unfamiliar to a broader audience (i.e. the RNA metabolism field).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The Ho et al. manuscript defines developmental functions for Barentsz (Btz), a core subunit of the EJC. While other EJC components, such as eIF4AIII, have been shown to have EJC-independent functions, it has not been clear whether Btz also acted independently of this multi-protein complex. The authors make use of two Btz genomic constructs, a wild-type transgene (Btz-WT) and a transgene carrying mutations in the two eIF4AIII-interacting residues (Btz-HD), to rigorously test whether or not Btz has any functions independent of the EJC. Interestingly, they show that while Btz-HD does not rescue Btz functions in the ovary or the muscle, it does rescue Btz functions at the larval NMJ. They back up the conclusion that Btz activity at the NMJ is independent of the EJC by showing that the growth phenotype observed in Btz mutants is not shared by mutants in other EJC components. How does Btz regulate NMJ development? The authors performed an RNAseq experiment and found that several components of an Activin/TGFB pathway are misregulated. Strikingly, they find that Activin overexpression rescues the NMJ phenotype in Btz mutants, consistent with its identification in the RNAseq analysis.

      This is a very logical and well-constructed paper. The results are well-controlled and convincing. Overall, the manuscript was a delight to read and makes an important contribution to dissecting the function of RNA-binding/associated proteins in neuronal development. I have only a few comments that could be considered prior to publication.

      Minor comments:

      1. In Fig. 1, the authors show that Btz-WT, but not Btz-HD, localizes to the posterior pole of the oocyte. Do the authors see Btz-WT and/or Btz-HD localized to MNs/muscles/glia at the NMJ?
      2. The mitochondrial phenotypes observed in Btz mutants are striking. But it seems possible that there are defects in overall mitochondrial levels in muscle in addition to defects in their localization. Overall, mitochondrial levels seemed reduced in Btz mutants. Is it possible to do an ATP5A immunoblot in Btz mutants to test whether overall mitochondrial levels are altered?
      3. ECM proteins are known to be critical for regulating TGFB signaling. That, taken with the multi-tissue genetic requirement for Btz, suggests that Btz might directly regulate either Ltl or Frac RNA, given that these ECM proteins are likely deposited by multiple cell types.

      Significance

      This paper establishes novel functions for the EJC complex protein Btz, and also delineates which functions depend on the EJC and which are independent. This is significant because there is intense interest in how post transcriptional regulation contributes to neuronal development. The paper fits with a body of literature dissecting neuronal functions for EJC proteins. It represents an important addition to this body of work.

      The audience will be molecular neuroscientists, especially those with interests in novel genetic regulatory mechanisms.

      My expertise is in developmental genetics and molecular neurobiology.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reviewer #1

      1. One key citation missing from the current manuscript is from Hwang et al. 2014 (PMID 25288734). This study has already described that the isp-1 mutant strain survives longer during P. aeruginosa infection. This citation also describes that the gene expression profile of isp-1 mutants animals includes a considerable number of pathogen-responsive genes that are similarly induced during infection. While the current manuscript does go into the mechanism of this resistance with more detail, they should amend the language to more appropriately reflect previous work, notably the above reference.

      We apologize for the oversight and have added the suggested citation. Hwang et al. show that isp-1 worms have increased resistance to bacterial pathogens that is dependent on HIF-1/HIF1 and AAK-2/AMPK. In future work, it will be interesting to examine whether HIF-1 and AAK-2 act in concert with, or independently of, ATFS-1 and the p38-mediated innate immune signaling pathway to mediate pathogen resistance and longevity in isp-1 worms. We will add these points to our discussion.

      1. The authors suggest that ROS activation of the p38 MAPK pathway is likely not the mechanism that explains the resistance of long-lived mitochondrial mutant animals due to their reduced food intake. However, is ROS production nonetheless involved? Does antioxidant treatment suppress the increased resistance during infection of isp-1 and/or nuo-6 mutant animals?

      To address this question, we will treat wild-type, isp-1 and nuo-6 worms with antioxidant and then measure resistance to bacterial pathogens using the P. aeruginosa strain PA14 slow kill assay. For the antioxidant treatment, we will use 10 mM Vitamin C as we have previously shown that this concentration is effective at reducing ROS in isp-1 worms to decrease isp-1 lifespan (Van Raamsdonk and Hekimi 2012, PNAS). Although antioxidant treatment can have pleiotropic effects, if this decreases survival of bacterial pathogen exposure, it will suggest that the elevated ROS production in isp-1 and nuo-6 worms may contribute to their enhanced bacterial pathogen resistance.

      1. (line 278-282): the authors should elaborate on how the p38 MAPK pathway plays a permissive role. It is intriguing that ATFS-1 and ATF-7 are both bZIP transcription factors that could theoretically heterodimerize and that they share common immune gene targets. The authors do indicate that the binding sites for ATFS-1 and ATF-7 are very different and are likely acting distinctly but some speculation would nonetheless strengthen this statement.

      While ATFS-1 and ATF-7 were shown to bind to the promoter regions of the same innate immunity genes, the apparent consensus binding sites are different, suggesting that they bind to different regions of the promoter. One way in which the p38 MAPK pathway may be playing a permissive role is that ATF-7 binding and relief from its repressor activity is required for any transcription of p38-mediated innate immunity target genes to occur. This is consistent with our data showing that disruption of nsy-1, sek-1, pmk-1 or atf-7 decreases the expression of innate immunity genes in wild-type worms. In contrast, it may be that the role of ATFS-1 is for enhanced expression of innate immunity genes such that when ATFS-1 is bound to the promoter region, or perhaps enhancer elements, the baseline expression of innate immunity genes that results from the binding of ATF-7 is increased. This idea is supported by our data showing that disruption of atfs-1 does not affect the expression of innate immunity genes in wild-type worms but prevents nuo-6 mutants from having increased expression. We will update our manuscript to include these points.

      1. The authors suggest that reduced food consumption of nuo-6 and isp-1 animals may suppress ROS-induced activation of the p38 innate immune pathway. It is intriguing that dietary restriction was previously shown to increase resistance to infection, presumably through p38-independent mechanisms (PMID 30905669). It would be interesting to measure host survival of nuo-6 and isp-1 mutant animals that are dietary-restricted to see if the enhanced survival rates conferred by mitochondrial stress and DR are additive or not.

      According to this suggestion, we will compare the bacterial pathogen resistance of wild-type, isp-1 and nuo-6 worms that have undergone dietary restriction to the same strains under ad libitum conditions. This will determine the extent to which their enhancement of pathogen resistance might be additive.

      1. Figure 2: It is intriguing that loss of p38 signaling appears to have different effects in nuo-6 versus isp-1 animals. Specifically, loss of p38 signaling in isp-1 mutants renders them more sensitive to infection than wild-type, whereas it generally suppresses survival rates back to wild-type levels in the nuo-6 mutant background. Even within the nuo-6 mutant group, loss of SEK-1 has more dramatic effects on nuo-6 mutant animals than does loss of NSY-1, PMK-1 or ATF-7(gf). This is despite the fact that the nsy-1, sek-1, and pmk-1 alleles that are used in this study are all reported to be null. Can the authors speculate on these differences?

      While the isp-1 and nuo-6 mutations both alter mitochondrial function, they affect different components of the electron transport chain. isp-1 mutations affect Complex III (Feng et al. 2001, Dev. Cell), while nuo-6 mutations affect Complex I (Yang and Hekimi 2010, Aging Cell). Although these mutants both have increased lifespan and a similar slowing of physiologic rates, it is not uncommon to observe differences between these mutants. For example, while treatment with the antioxidant NAC completely reverts nuo-6 lifespan to wild-type, it only partially reduces isp-1 lifespan (Yang and Hekimi 2010, PLoS Biology), suggesting that nuo-6 lifespan may be more dependent on ROS than isp-1. We have recently shown that deletion of atfs-1 reduces nuo-6 lifespan, but completely prevents isp-1 worms from developing to adulthood (Wu et al. 2018, BMC Biology), suggesting that isp-1 worms are more dependent on ATFS-1 than nuo-6 worms. Disruption of sek-1 has a greater impact on pathogen resistance than nsy-1 and pmk-1 because SEK-1 is absolutely required for innate immune signaling, while some partial redundancy exists for NSY-1 and PMK-1. We will add these points to our manuscript.

      1. One of the main conclusions from this study is that ATFS-1 likely binds directly to innate immune genes that are in common with ATF-7. Since this is such a pivotal finding, the authors should validate some candidate genes from the referenced ChIP seq datasets using ChIP qPCR. Also, are there predicted ATFS-1 binding sites (PMID 25773600) in these promoters?

      Our data shows that activation of ATFS-1 increases the expression of innate immunity genes without increasing activation of p38. The simplest explanation for this observation is that ATFS-1 can upregulate the same innate immunity genes as ATF-7. Accordingly, we hypothesized that ATFS-1 and ATF-7 can bind to the same promoter. Fortunately, two previous ChIP-Seq studies, from well-established laboratories who have extensive experience studying ATFS-1 and ATF-7, had already determined which genes are bound by these two transcription factors (Nargund et al. 2015, Molecular Cell; Fletcher et al. 2019, PLoS Genetics). Comparing the results of these two published studies confirmed our hypothesis by demonstrating that the same innate immunity genes are bound by both ATF-7 and ATFS-1 in vivo. In order to provide additional support for the conclusion that ATFS-1 and ATF-7 can bind to the same genes, we will examine the genetic sequence of innate immunity genes that were shown to be bound by both ATFS-1 and ATF-7 in the published ChIP-seq studies to identify predicted binding sites for ATFS-1 and ATF-7, while noting that the ATFS-1-associated sequence is an enriched motif and not an established binding site. If we are able to identify the predicted binding sites for these two transcription factors in the same gene, it will provide further support for the conclusion that these transcription factors can both bind to the same innate immunity genes.

      Reviewer #2

      1. The authors state that the p38 MAPK PMK-1 is not activated in the long-lived mitochondrial mutants. However, it might be better to state that there is "no enhanced activation" of PMK-1, since they clearly show in nuo-6 and isp-1 mutants the presence of phosphorylated PMK-1 (Fig. 4A), which would indicate an activated form of PMK-1 in these mutants.

      According to this suggestion, we will change the text to indicate that there is no enhanced activation of PMK-1 in nuo-6 and isp-1 worms.

      1. Are the food-intake behaviors of all mutants in liquid culture (Fig. 4B-F) the same as their food-intake behaviors on solid agar media, the environment where pathogen resistance was measured?

      We previously compared assays measuring food intake on solid agar media versus the liquid culture approach used in the current study to determine which method is the most robust (Wu et al. 2019, Cell Metabolism). While both assays produced similar results, performing the food intake assay on solid agar plates was much more variable as it is challenging to scrape off all of the uneaten bacteria from solid plates in order to measure it. Since the approach of measuring food intake in liquid media produces more consistent and reliable results, we chose to use this assay for the current study. We will update our manuscript to include this justification.

      1. Does the p38 pathway single mutant nsy-1 or sek-1 live shorter than wild type on dead E. coli OP50 (Fig. S9) than they do on live OP50 (Fig. 3)? If so, what might that mean? These mutants are also living shorter than wild type on PA14 (Fig. 2), but live as long as wild type on OP50 (Fig. 3). What is in the live OP50 that allows these mutants to live like wild type?

      In a previous publication, we found that sek-1 mutants live shorter than wild-type worms, and nsy-1 live slightly shorter than wild-type worms in a lifespan assay performed in liquid medium with dead OP50 bacteria (Wu et al. 2019, Cell Metabolism). In the current study, we performed lifespan assays on solid NGM plates with live OP50 bacteria and observed a wild-type lifespan in sek-1 and nsy-1 worms. Since there are multiple experimental variables that are different between the previous and current study, most notably liquid versus solid media, the lifespan results cannot be directly compared. In the case of measuring survival of these strains on PA14, the simplest explanation is that they are dying sooner because their innate immune signaling pathway is disrupted, and so they are less able to mount an immune response against the pathogenic bacteria. We will update our manuscript to include these points.

      At the same time, wouldn't it be simpler to call the multiple antibiotic-treated OP50 as "dead bacteria", instead of "non-proliferating bacteria"? Some of the antibiotics used to treat OP50 are bactericidal and not bacteriostatic.

      We previously monitored the OD600 of the antibiotic-treated, cold-treated OP50 that we used in our experiment, and found that there is only a very small decrease in OD600 after 10 days (Moroz et al. 2014, Aging Cell). Since dead bacteria are rapidly broken down leading to a decrease in OD600, this result is consistent with the bacteria being alive but not proliferating. We will include this point in our manuscript.

      1. Since nuo-6 and isp-1 do not always behave exactly the same in their dependence on certain genes (e.g., Fig. 2C vs Fig 2D), what happens in isp-1; atfs-1 double mutants? Do these mutants behave in the same manner as nuo-6; atfs-1?

      This is an interesting question. Unfortunately, isp-1;atfs-1 mutants arrest during development (Wu et al. 2018, BMC Biology), which is why we only examined the effect of atfs-1 deletion in nuo-6 mutants. We will update the manuscript to note this point.

      Regarding nuo-6; atfs-1, why does the double mutant live shorter on PA14 than either single mutant (Fig. 6A)? Is this because atfs-1 is needed to activate the p38 MAPK-dependent and -independent pathways?

      It is possible that the nuo-6 mutation makes worms more sensitive to bacterial pathogens, perhaps due to decreased energy production, and that activation of ATFS-1 is required not only to enhance their resistance to pathogens but also to increase their resistance back to wild-type levels. In a previous study, we showed that loss of ATFS-1 slows down the rate of nuclear localization of DAF-16. Thus, loss of atfs-1 may also be decreasing resistance to bacterial pathogens by diminishing the general stress resistance imparted by the DAF-16-mediated stress response pathway. We will update the manuscript to include these points.

      In Fig. 7B, the atfs-1(gof) appears to have slightly more phosphorylated p38 compared to wild type, although it is not statistically significant?

      While there is a trend towards a very modest increase in phosphorylated p38 in the constitutively-active atfs-1 mutant compared to wild-type, quantification of four biological replicates indicated that the difference is not significant. This result is consistent with the fact that the levels of phosphorylated p38 are not significantly increased in nuo-6 or isp-1 mutants, both of which show activation of ATFS-1. We have provided raw images of all of these Western blots in our supplementals. In addition, we will repeat these Western blots to determine if this difference becomes significant with additional replicates.

      In Fig. 6B, the atfs-1 loss-of-function single mutant also increases the expression of Y9C9A.8, but suppresses it in a nuo-6 mutant background? What might that mean?

      It is possible that in wild-type animals disruption of atfs-1 causes a compensatory upregulation of specific stress response genes. We have previously shown that deletion of atfs-1 results in upregulation of chaperone genes involved in the cytoplasmic unfolded protein response (hsp-16.11, hsp-16.2; Wu et al. 2018; BMC Biology). Perhaps Y9C9A.8 is acting in a similar way. In nuo-6, the upregulation of Y9C9A.8 is driven by activation of ATFS-1, and thus is prevented by atfs-1 deletion. We will add these points to the manuscript.

      Reviewer #3

      1. Some studies propose that OP50 offers some toxicity to worms which is not observed in other bacterial strains like HT115. The authors should test the role of the p38-innate immune signaling pathway in nuo-6 and isp-1 lifespan using other non-pathogenic E. coli strains.

      To determine if the effect of disrupting the p38-mediated innate immune signaling pathway on the lifespan of isp-1 and nuo-6 mutants was simply the result of losing protection against OP50 bacteria, we examined the effect of nsy-1, sek-1 and atf-7(gof) mutations on isp-1 and nuo-6 lifespan using non-proliferating bacteria. We found that even when no proliferating bacteria are present, disruption of the p38-mediated innate immune signaling pathway markedly decreases isp-1 and nuo-6 lifespan. This suggests that the p38-mediated innate immune signaling pathway is required for their long lifespan independently of its ability to protect against bacterial infection. Similarly, we have previously shown that lifespan extension resulting from dietary restriction is dependent on the p38-mediated innate immune signaling pathway even when non-proliferating bacteria are used (Wu et al. 2019, Cell Metabolism). We will clarify this important point in the manuscript.

      1. The authors should measure food intake in worms exposed to pathogenic bacteria, given that reduced bacterial intake may be related to reduced mortality.

      Unfortunately, it is not feasible to perform the food intake assay using the pathogenic bacteria because the bacteria cause death thereby complicating the calculation of food consumed per worm (which requires at least 3 days to assess). As an alternative to measuring food intake, we will attempt to measure intestinal accumulation of P. aeruginosa, which is a balance between food intake and other factors. To do this we will use a P. aeruginosa strain that expresses GFP and quantify the amount of intestinal fluorescence in wild-type, isp-1 and nuo-6 worms that have been grown on the GFP-labelled P. aeruginosa.

      1. The authors should check if ROS is required for the activation of the p38-mediated innate immune signaling pathway and reduction in food intake.

      To determine if the elevated ROS that is present in isp-1 and nuo-6 worms affects activation of the p38-mediated innate immune signaling pathway, we will treat wild-type, isp-1 and nuo-6 worms with Vitamin C and measure the ratio of phosphorylated p38 to total p38 by Western blotting. Similarly, to examine the effect of ROS on food intake, we will treat wild-type, isp-1 and nuo-6 worms with Vitamin C and then quantify its effect on food intake. For these experiments, we will use 10 mM Vitamin C as we have previously shown that this concentration is effective at reducing ROS in isp-1 worms to decrease isp-1 lifespan (Van Raamsdonk and Hekimi 2012, PNAS).

      1. Since ATFS-1 and the p38 pathway control food intake, how related to dietary restriction are the phenotypes the authors are studying?

      While the lifespan extension that results from mild impairment of mitochondrial function and the lifespan extension resulting from dietary restriction are both dependent on the p38-mediated innate immune signaling pathway, these interventions modulate innate immunity gene expression in opposite directions. We previously reported that dietary restriction primarily downregulates innate immunity genes (Wu et al. 2019 Cell Metabolism). Here, we show that mutations in isp-1 or nuo-6 primarily result in upregulation of innate immunity genes. To more globally examine gene expression changes between dietary restriction and mild impairment of mitochondrial function, we compared differentially expressed genes. We found that there was very little overlap of either upregulated or downregulated genes between dietary restriction and isp-1/nuo-6 mutants. We will add a supplementary figure to demonstrate this, and add these points to our manuscript.

      1. Somewhat related to the previous points, I am not so sure whether the changes in food intake are cause or consequence of the alterations in the innate immunity-related genes. Reduced food intake is depicted in Fig. 8 as the cause of the activation of the p38 pathway, but there is not enough evidence to unequivocally prove that. In fact, food intake might be controlled by the p38 or ATFS-1 pathway or by a common regulator such as ROS.

      We apologize that we didn’t make this clearer. In our previous work, we showed that dietary restriction results in decreased activation of the p38 pathway (Wu et al. 2019, Cell Metabolism). Here, we show that activation of ATFS-1 results in decreased food intake. Based on our previous study, this decrease in food intake should similarly decrease p38 pathway activation. In Figure 8, we have depicted ATFS-1 inhibiting food intake, and food intake activating the p38-mediated innate immune signaling pathway. Combined, our model suggests that activation of ATFS-1 should act to decrease p38-mediated innate immune signaling. We will clarify this in the figure legend.

      1. I am not so convinced of the role of DAF-16. In fact, in Fig. 5A daf-16 mutation reduces pathogen resistance and that could represent a toxic effect of the mutation. Furthermore, the results in Fig. 4D do not exclude the possibility that daf-16 and isp-1 act in parallel.

      We agree that the role of DAF-16 could be non-specific. While we show that disruption of daf-16 leads to decreased bacterial pathogen survival in isp-1 worms, it also decreases bacterial pathogen survival in wild-type worms. Since DAF-16 is known to be required for general resistance to stress, the decreased survival when daf-16 is disrupted could be due to a general toxic effect of reducing general stress resistance. This conclusion is consistent with our observation that DAF-16 is not involved in the upregulation of innate immunity genes in isp-1 worms. We will emphasize these points in our manuscript.

      1. Loss of innate immunity related genes may result in toxicity and sensitize worms to pathogenic bacteria. This is further supported by an even lower resistance to pathogens in the double mutants mainly in Fig. 2D.

      We agree. Our data confirms that disruption of the p38-mediated innate immune signaling pathway makes worms more susceptible to bacterial pathogens. We will emphasize this point.

      1. The blots are saturated, particularly in Fig. 4A, and this can be masking the differences in p38 phosphorylation. In fact, the fact that p38 phosphorylation is not changed is contradictory to the other results. How is p38 regulated by mitochondrial mutations then? I am concerned that p38 is actually not altered and the changes in gene expression are exclusively due to ATFS-1. The interaction with the p38 pathway demonstrated genetically could be due to the toxicity elicited by the loss of function mutations in this pathway.

      To address this concern, we will repeat the Western blotting experiment to compare the ratio of phosphorylated p38 to total p38 between wild-type, isp-1 and nuo-6 worms. We will take multiple exposures to ensure that the blots are not over-saturated. Having already completed four replicates, we believe that there is not a major change in p38 activation. Our data suggests that the p38-mediated innate immunity pathway is playing a permissive role such that it is required for baseline expression of innate immunity genes, but that activation of ATFS-1 is driving the enhanced expression of innate immunity genes that we observe in the long-lived mitochondrial mutants and constitutively active atfs-1 mutants. We will update our manuscript to clarify this.

      Minor concerns

      1. Lines 167 and 174: What do these p-values refer to?

      The p-values indicate the significance of the overlap between the two gene sets. Given the size of the two gene sets, this is the probability that the observed number of overlapping genes would result by picking genes at random. We will clarify this in the manuscript.
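      As an illustration of the kind of calculation described above, the probability of seeing at least the observed overlap when genes are drawn at random can be computed as a hypergeometric tail. The rebuttal does not state which test was actually used, so this is only a minimal sketch under that assumption; the function name and the gene counts in the usage example are hypothetical and are not taken from the manuscript.

```python
from math import comb


def overlap_p_value(population: int, set_a: int, set_b: int, overlap: int) -> float:
    """Probability of at least `overlap` shared genes between two gene sets
    of sizes `set_a` and `set_b` drawn at random from `population` genes
    (upper tail of the hypergeometric distribution)."""
    if not (0 <= set_a <= population and 0 <= set_b <= population):
        raise ValueError("gene set sizes must lie between 0 and the population size")
    start = max(overlap, 0)            # a negative overlap is treated as 0
    stop = min(set_a, set_b)           # the overlap cannot exceed the smaller set
    total = comb(population, set_b)    # number of ways to choose set B
    # Sum P(X = k) for k = start..stop; impossible values of k contribute 0.
    tail = sum(
        comb(set_a, k) * comb(population - set_a, set_b - k)
        for k in range(start, stop + 1)
    )
    return tail / total


# Hypothetical numbers, for illustration only: 20,000 genes in the genome,
# gene sets of 500 and 300 genes, 40 genes in common.
p = overlap_p_value(population=20000, set_a=500, set_b=300, overlap=40)
print(f"p = {p:.3g}")
```

      The choice of `population` (for example, all annotated genes versus only the genes detected in the experiment) can change the result considerably, so it should be reported alongside the p-value.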

      1. Line 258: I partially agree with the conclusions, since the functions may not necessarily be associated with innate immune signaling but rather other functions of p38.

      Since isp-1 and nuo-6 worms have extended longevity even when grown on non-proliferating bacteria, this indicates that their long life is not dependent on their enhanced resistance to bacterial pathogens. Similarly, since disruption of genes in the p38-mediated innate immune signaling pathway decreases isp-1 and nuo-6 lifespan even when the worms are grown on non-proliferating bacteria, this suggests that this pathway enhances longevity independently of its ability to increase innate immunity.

      1. Why in figures 4D and E different mutants were used?

      We only used isp-1 mutants to examine the effect of daf-16 because we were unable to generate nuo-6;daf-16 mutants due to close proximity of the two genes on the same chromosome. We only used nuo-6 mutants to examine the effect of atfs-1 because isp-1;atfs-1 worms arrest during development. We will include this explanation in our manuscript.

      1. Line 498: revise writing.

      We will rewrite this sentence to improve clarity.

      1. Show blots in Fig. 7B.

      We will provide an image of a representative Western blot in Figure 7, and will provide the raw images for all of the Western blots in our supplementary materials.

      1. It would be interesting to know where the activation of the immune-related genes by the mitochondrial mutations is happening, whether this is a cell autonomous or cell non-autonomous mechanism.

      While it would be interesting to explore whether specific tissues are important in sensing mitochondrial impairment in order to upregulate genes involved in innate immunity, it is beyond the scope of this manuscript. Previous work has shown that knocking down the expression of the cytochrome c oxidase gene cco-1 in neurons can activate the ATFS-1 target gene hsp-6 in the intestine (Durieux et al., 2011). Based on this, one could hypothesize that a similar cell non-autonomous mechanism might be involved. We will note this possible future direction in our discussion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Campos et al provide evidence that mild mitochondrial dysfunction in C. elegans induces genes involved in innate immunity and promotes bacterial pathogen resistance and longevity, while inhibiting food intake through an ATFS-1-mediated mechanism. The manuscript is well-written and the experiments are well-performed and reported. However, there are several points that need to be addressed before the manuscript can be published.

      Major concerns

      1. Some studies propose that OP50 offers some toxicity to worms which is not observed in other bacterial strains like HT115. The authors should test the role of the p38-innate immune signaling pathway in nuo-6 and isp-1 lifespan using other non-pathogenic E. coli strains.
      2. The authors should measure food intake in worms exposed to pathogenic bacteria, given that reduced bacterial intake may be related to reduced mortality.
      3. The authors should check if ROS is required for the activation of the p38-mediated innate immune signaling pathway and reduction in food intake.
      4. Since ATFS-1 and the p38 pathway control food intake, how related to dietary restriction are the phenotypes the authors are studying?
      5. Somewhat related to the previous points, I am not so sure whether the changes in food intake are cause or consequence of the alterations in the innate immunity-related genes. Reduced food intake is depicted in Fig. 8 as the cause of the activation of the p38 pathway, but there is not enough evidence to unequivocally prove that. In fact, food intake might be controlled by the p38 or ATFS-1 pathway or by a common regulator such as ROS.
      6. I am not so convinced of the role of DAF-16. In fact, in Fig. 5A daf-16 mutation reduces pathogen resistance and that could represent a toxic effect of the mutation. Furthermore, the results in Fig. 4D do not exclude the possibility that daf-16 and isp-1 act in parallel.
      7. Loss of innate immunity related genes may result in toxicity and sensitize worms to pathogenic bacteria. This is further supported by an even lower resistance to pathogens in the double mutants mainly in Fig. 2D.
      8. The blots are saturated, particularly in Fig. 4A, and this can be masking the differences in p38 phosphorylation. In fact, the fact that p38 phosphorylation is not changed is contradictory to the other results. How is p38 regulated by mitochondrial mutations then? I am concerned that p38 is actually not altered and the changes in gene expression are exclusively due to ATFS-1. The interaction with the p38 pathway demonstrated genetically could be due to the toxicity elicited by the loss of function mutations in this pathway.

      Minor concerns

      1. Lines 167 and 174: What are these p values referred to?
      2. Line 258: I partially agree with the conclusions, since the functions may not necessarily be associated with innate immune signaling but rather other functions of p38.
      3. Why in figures 4D and E different mutants were used?
      4. Line 498: revise writing.
      5. Show blots in Fig. 7B.
      6. It would be interesting to know where the activation of the immune-related genes by the mitochondrial mutations is happening, whether this is a cell autonomous or cell non-autonomous mechanism.

      Significance

      This study provides a significant advance in mechanistic aspects of lifespan regulation in worms, linking mitochondrial metabolism, food intake, innate immunity, resistance to pathogen infections and longevity. The work presents novel mechanistic insights that could be applied to understand how mild mitochondrial dysfunction leads to increased lifespan. Overall, the audience interested in this study is expected to be aging biologists and possibly immunologists with particular interest in mechanistic aspects of longevity and innate immunity, as well as C. elegans as a model organism. I am part of this group of scientists with particular interest in studying the interplay between metabolism and aging.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Campos et al. show that mild mitochondrial impairment promotes C. elegans resistance against the bacterial pathogen Pseudomonas aeruginosa PA14, which is associated with increased expression of a subset of innate immunity genes in the animal. Interestingly, upregulation of the innate immunity genes in the mitochondrial electron transport chain mutants, nuo-6 (complex I) and isp-1 (complex III), does not appear to involve enhanced activation of the p38 MAPK PMK-1, which has been previously implicated in anti-bacterial immunity (Jeong et al, EMBO J 2017, 36, 1046). Because the authors also show that this increased pathogen resistance and expression of innate immunity genes in at least one of the mitochondrial mutants (nuo-6) only partly depend on the p38 PMK-1 pathway, this would argue for the involvement of another pathway. The authors show that this other pathway involves the mitochondrial unfolded protein response (mitoUPR) through activation of the transcription factor atfs-1, which not only upregulates a subset of innate immunity genes, but also presumably decreases pathogen intake. Together their data suggest that the p38 PMK-1 pathway and mitoUPR act in parallel to promote the enhanced pathogen resistance of mitochondrial mutants.

      Moreover, while they show that the FOXO transcription factor daf-16 is also required for the enhanced pathogen resistance of mitochondrial mutants (i.e., isp-1), they rule out daf-16 involvement in the activation of innate immunity genes. Instead, daf-16 decreases pathogen intake and upregulates other stress-response genes. Thus, this study highlights the requirement for multiple pathways to promote pathogen resistance through multiple mechanisms.

      Major comments:

      (1) The authors state that the p38 MAPK PMK-1 is not activated in the long-lived mitochondrial mutants. However, it might be better to state that there is "no enhanced activation" of PMK-1, since they clearly show in nuo-6 and isp-1 mutants the presence of phosphorylated PMK-1 (Fig. 4A), which would indicate an activated form of PMK-1 in these mutants.

      (2) Are the food-intake behaviors of all mutants in liquid culture (Fig. 4B-F) the same as their food-intake behaviors on solid agar media, the environment where pathogen resistance was measured?

      (3) Does the p38 pathway single mutant nsy-1 or sek-1 live shorter than wild type on dead E. coli OP50 (Fig. S9) than they do on live OP50 (Fig. 3)? If so, what might that mean? These mutants are also living shorter than wild type on PA14 (Fig. 2), but live as long as wild type on OP50 (Fig. 3). What is in the live OP50 that allows these mutants to live like wild type?

      At the same time, wouldn't it be simpler to call the multiple antibiotic-treated OP50 as "dead bacteria", instead of "non-proliferating bacteria"? Some of the antibiotics used to treat OP50 are bactericidal and not bacteriostatic.

      (4) Since nuo-6 and isp-1 do not always behave exactly the same in their dependence on certain genes (e.g., Fig. 2C vs Fig 2D), what happens in isp-1; atfs-1 double mutants? Do these mutants behave in the same manner as nuo-6; atfs-1?

      Regarding nuo-6; atfs-1, why does the double mutant live shorter on PA14 than either single mutant (Fig. 6A)? Is this because atfs-1 is needed to activate the p38 MAPK-dependent and -independent pathways? In Fig. 7B, the atfs-1(gof) appears to have slightly more phosphorylated p38 compared to wild type, although it is not statistically significant?

      In Fig. 6B, the atfs-1 loss-of-function single mutant also increases the expression of Y9C9A.8, but suppresses it in a nuo-6 mutant background? What might that mean?

      Some of my comments can be easily addressed with written comments. Others might require generation of a strain, like the isp-1; atfs-1 double mutant, prior to any assays.

      Significance

      Please see the above summary for the significance of this manuscript to the field. Importantly, this study highlights the requirement for multiple pathways to promote pathogen resistance through multiple mechanisms. Readers interested in aging, mitochondrial function, innate immunity and stress responses should find this study thought-provoking. I include myself in this group of readers, since I study the genetics of C. elegans aging and stress responses.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Campos et al. describes the association between long-lived mitochondrial mutants and increased resistance to pathogen infection. The authors discover that mitochondrial electron transport chain mutants (nuo-6 and isp-1) display increased expression of many genes involved in innate immunity that are regulated by the p38 signaling pathway. Consistent with this finding, mito mutants displayed increased survival during infection. p38 signaling was found to be required for these innate immune gene inductions during mitochondrial stress and for their increased survival during infection. p38 signaling was also found to be required for the increased lifespan of isp-1 and nuo-6 mutant animals. Intriguingly, p38 signaling does not appear to be affected in these mitochondrial mutants, despite being required for the increase in immunity/host resistance. The authors discover that mitochondrial stress animals exhibit reduced feeding which they argue may suppress any activation of the p38 pathway caused by ROS. The mitochondrial UPR was also found to be required for the increase in innate immune gene expression in isp-1 and nuo-6 mutant animals, as well as their extended survival. The authors conclude that ATFS-1 can act in parallel to p38 signaling by directly binding to common innate immune target genes. In support of this, ATFS-1 and ATF-7 appear to bind to shared target genes but likely at independent sites due to their different consensus sequences.

      1. One general consideration is that some of the key concepts outlined in this manuscript have already been described previously and are therefore not entirely novel conceptually. For example, one key citation missing from the current manuscript is from Hwang et al. 2014 (PMID 25288734). This study has already described that the isp-1 mutant strain survives longer during P. aeruginosa infection. This citation also describes that the gene expression profile of isp-1 mutant animals includes a considerable number of pathogen-responsive genes that are similarly induced during infection. While the current manuscript does go into the mechanism of this resistance with more detail, they should amend the language to more appropriately reflect previous work, notably the above reference.
      2. The authors suggest that ROS activation of the p38 MAPK pathway is likely not the mechanism that explains the resistance of long-lived mitochondrial mutant animals due to their reduced food intake. However, is ROS production nonetheless involved? Does antioxidant treatment suppress the increased resistance during infection of isp-1 and/or nuo-6 mutant animals?
      3. (line 278-282): the authors should elaborate on how the p38 MAPK pathway plays a permissive role. It is intriguing that ATFS-1 and ATF-7 are both bZIP transcription factors that could theoretically heterodimerize and that they share common immune gene targets. The authors do indicate that the binding sites for ATFS-1 and ATF-7 are very different and are likely acting distinctly but some speculation would nonetheless strengthen this statement.
      4. The authors suggest that reduced food consumption of nuo-6 and isp-1 animals may suppress ROS-induced activation of the p38 innate immune pathway. It is intriguing that dietary restriction was previously shown to increase resistance to infection, presumably through p38-independent mechanisms (PMID 30905669). It would be interesting to measure host survival of nuo-6 and isp-1 mutant animals that are dietary-restricted to see if the enhanced survival rates conferred by mitochondrial stress and DR are additive or not.
      5. Figure 2: It is intriguing that loss of p38 signaling appears to have different effects in nuo-6 versus isp-1 animals. Specifically, loss of p38 signaling in isp-1 mutants renders them more sensitive to infection than wild-type, whereas it generally suppresses survival rates back to wild-type levels in the nuo-6 mutant background. Even within the nuo-6 mutant group, loss of SEK-1 has more dramatic effects on nuo-6 mutant animals than does loss of NSY-1, PMK-1 or ATF-7(gf). This is despite the fact that the nsy-1, sek-1, and pmk-1 alleles that are used in this study are all reported to be null. Can the authors speculate on these differences?
      6. One of the main conclusions from this study is that ATFS-1 likely binds directly to innate immune genes that are in common with ATF-7. Since this is such a pivotal finding, the authors should validate some candidate genes from the referenced ChIP seq datasets using ChIP qPCR. Also, are there predicted ATFS-1 binding sites (PMID 25773600) in these promoters?

      Significance

      As mentioned in my comments, some of the findings of the current manuscript have been shown before. Nonetheless, the authors do describe new insights into the mechanism of how mitochondrial stress signaling promotes host resistance to infection, which is noteworthy.

      This manuscript would be of value to researchers in the fields of mitochondrial biology, mitochondrial stress signaling (including the UPRmt field), host-pathogen interactions, and longevity determination.

      My expertise is in stress signaling in the context of longevity and host-pathogen interactions.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reviewer #1:

      This paper shows that transient genetic induction of the IMD innate immune pathway during Drosophila development has long term effects on adult health and lifespan. The paper is well-written, the experiments are well designed and executed, and the data are without exception good quality. The data also support the specific conclusions well. The experiments take full advantage of the Drosophila system to pinpoint the effect on lifespan to long term activation of inflammation in the gut, which is interlinked and dependent upon changes in the microbiota. However, the analysis is not comprehensive, because neural-specific effects on starvation resistance are not followed up, and because the etiology of the changes in microbiota is not mapped out. I should also say that I do not fully agree with the conclusion in the last sentence of the Abstract (the most important general conclusion), that the study "demonstrates a tissue-specific programming effect" of early transient IMD function. Since the lifespan shortening was shown to be dependent upon increased gut Gluconobacter, I would not call this "programming" (though the term is vague enough to mean most anything.) Instead, I would refer to the effect as a host-environment interaction. If it were "programming" of, for instance, the genetic or epigenetic sort, it would not be so easy to reverse.

      Response1-1: We thank the reviewer for the fair evaluation of the manuscript. We agree with the point about the "programming" effect: it might be a bit of an overstatement. We would like to make our conclusion more modest and avoid the ambiguous word in the last sentence of the abstract.

      A few other minor comments:

      1. In several experiments, the authors use GFP (Fig S1) or the IMD targets DptA or Dro (Fig S2) to validate the induction of IMD-CA. Why have they not directly measured the expression of IMD-CA? This would seem to be logical and technically easy, by qPCR.

      Response1-2: We will perform qPCR of the Imd gene.

      1. In Fig 4 we see an experiment in which animals were "supplemented" with Alkaline Phosphatase, a protein. How was this done and why does it work? Is AP a gut luminal protein?

      Response1-3: It is a luminal protein, and thus the ingested protein works just like the endogenous one. This has also been shown in the literature (Kühn F et al., JCI insight, 2020). The protein targets, for example, peptidoglycan to attenuate its immunostimulatory capacity. We will add this explanation to the text.

      1. The results in Fig 5 are really where the paper begins to determine a mechanism for the lifespan shortening. However, these results are rather weak, and they don't extend very far. The increase in Gluconobacter is mild (Fig 5C), and is not clear in the 16S rRNA sequencing experiment (Fig 5A). Furthermore, it is not clear that Gluconobacter specifically is the source of the lifespan shortening, or just bacteria in general (Fig 5E).

      Response1-4: We focus on this bacterial genus because we have already shown in our previous paper that an increase in Gluconobacter shortens organismal lifespan (Kosakamoto H et al., Cell Reports, 2020). We also reported that Gluconobacter increases in response to (necrosis-induced) immune activation, a situation strikingly similar to the larval IMD activation in the present study. As in that work, we wanted to perform a gnotobiotic/monoassociation experiment here to show that the bacterium is sufficient for the lifespan-shortening phenotype; however, preliminary experiments implied that combining germ-free conditions with the GeneSwitch system is technically difficult, as it caused higher lethality. This might be because the drug RU486 shows different bioavailability/dynamics in GF flies.

      Significance:

      Although this paper addresses an interesting topic using an elegant and effective experimental strategy, the final results (Fig 5) and conclusions are modest. The analysis doesn't extend far enough to demonstrate how long term changes in microbiota arise from short term developmental changes in innate immune activity. Moreover, there is no detailed data concerning how the altered microbiota alter lifespan. Thus, while the results are interesting and the findings open avenues for further studies on the topic, the significance of the paper is modest, in its current state. Further analysis of how the microbiota is permanently changed, and why this affects lifespan, could enhance the paper. However, it is not clear that any simple, quick experiments could dramatically advance the findings from where they are now.

      Response1-5: We would like to add data showing that IMD activation in the larvae already increases Gluconobacter in the larval gut. These data suggest mechanistically that the microbiome alteration in the larval gut persists into adulthood, demonstrating how larval immune signalling influences adult immune activity. These data should strengthen the concept that even a transient and mild immune activation at the juvenile stage can disrupt the microbiota and permanently trigger the inflammatory pathology.

      Reviewer #2:

      In this manuscript, the authors study the impact of ubiquitously activating the IMD pathway only during larval stages on subsequent adult life. They report a shortened lifespan due to IMD pathway activation in the larval gut and a resistance to starvation linked to its activation in the nervous system. While there is apparently no activation of the IMD pathway in very young adult flies, the expression of some IMD-dependent antimicrobial peptide (AMP) genes is reported from 7-10-day-old flies onwards. This expression is lost upon treating the adults with antibiotics, which also rescues the shortened lifespan phenotype. It correlates with a possible increase in the proportion of Gluconobacter in the microbiota.

      While the study looks interesting, it is not clear whether the results, especially those of survival studies and RTqPCR experiments, have been replicated in independent experiments. This is essential to warrant their conclusions. In this respect, this reviewer notes some important variability in the lifespan studies (e.g., Fig. 2B vs. Fig. 4E): how do the authors account for a lifespan that is shortened almost by half in Fig. 4E? Also, Fig. S2B is not convincing given the observed variability. More data points are required to reach a conclusion.

      Response2-1: We would like to mention that all experiments have been replicated at least twice. We admit that the phenotypes of larval IMD activation, such as the lifespan-shortening effect and the inflammatory response in the adult gut, are indeed quite variable, empirically depending on the season. This is not surprising to us, since many immune-metabolic phenotypes as well as the lifespan of the flies vary between seasons. We assume that this implies that the effect is mediated by the gut microbiota. In Japan, we have typical seasonal changes in temperature/humidity that greatly influence the gut microbial situation, even though we use an incubator that allows a constant temperature/humidity setting. We therefore need to carefully compare the phenotypes of flies within the same experiment, and this is where GeneSwitch works effectively.

      Regarding Fig. S2B, we could increase the number of samples in a new experiment.

      The authors suggest in their Discussion some kind of epigenetic mechanism transmitting the information of IMD pathway activation having occurred at larval stages. Whether this depends on a change of metabolism remains to be demonstrated, in as much it is likely that there is a major metabolic "reset" occurring during metamorphosis to prepare the individual to the new environmental conditions encountered as an adult. It is also likely that larvae in the wild grow in a microbe-rich slurry and are likely to experience intestinal infections. As noted by the authors themselves on the top paragraph of p7 (line numbers are unreadable), the larval gut is degenerated during metamorphosis and thus the enterocytes that have produced AMPs are no longer present. One possibility would be that there is an early dysbiosis already occurring during larval stages and that the young adults re-infect themselves, for instance through contact with the meconium. The authors' experiments with antibiotics are the key to this study. However, one would like to observe results of the converse experiment, that is, treating larvae with antibiotics (a better control would be to bleach the embryos to generate axenic flies) and then raising the hatched adult flies in a conventional manner. In this way, the authors may determine whether the influence of early IMD pathway activation occurs through "self" mechanisms or whether it entails a contribution from the microbiota. It might also be useful to use reporter transgenes such as Dpt-LacZ to document where in the gut IMD activation takes place in the adult and to monitor whether there is any weak signal that would not be picked up by RTqPCR in newly hatched flies.

      Response2-2: We greatly appreciate the reviewer pointing out this important caveat. We have now checked for dysbiosis in the larval gut (by qPCR of Gluconobacter) and found that Gluconobacter is already increased. For details, please see our Response1-4/1-5. This substantially improves the study.

      Regarding monitoring IMD activation by the reporter, we plan to do this experiment in our next project. Obviously, a remaining question is how an epigenetic mechanism in a particular cell/locus mediates the phenotype. This is our next goal and thus lies beyond the scope of this paper.

      Specific comments

      1. The GS system used in this study requires multiple controls, as a study from the Seroude laboratory has reported a driver-dependent leakiness of expression independent of exposure to RU486 (Poirier et al., Aging Cell, 2008). Thus, it would be good to check this with a cross to a UAS-GFP driver and examining the 10 and 40-day time points. The same should be done with antibiotics-treated flies as regards DptA and Drosocin expression (Fig. 5C & D: the age of the adult flies is not specified; it would also be useful to examine the distribution of Acetobacter and Gluconobacter at 10 and 40 days).

      Response2-3: We believe the backcrossed UAS-LacZ would be suitable as a control. For key experiments, we checked for side effects of RU486 and confirmed that there were none. What we have not been confident about in this respect is the gut microbiota, and we would therefore test whether Gluconobacter is increased by RU486 alone. Regarding Fig. 5C&D, we used young (day-10) flies. We did not examine Acetobacter/Gluconobacter at older ages, but we assume that they are still present in the gut microbiota. How ageing involves microbial change in this and many other contexts is our ongoing project.

      1. The authors state at the bottom of p6 that JAK-STAT-dependent AMP expression was detected. Fig. 4C shows a significant expression of Drsl2. As far as this reviewer recalls, Buchon et al. had demonstrated a dependence on the JAK-STAT pathway of Drsl3. It would also be worth looking at Turandot genes. As regards an involvement of the Toll pathway, it is not clear whether Drosomycin is significantly expressed as it shows a 32-fold increase in Fig. 4C, yet is not found in Table S2. This issue should be clarified using RTqPCR and it may be worth monitoring also the expression of BomS1.

      Response2-4: We would like to add the qRT-PCR of TotA,C, Drs, and BomS1 in the revised manuscript.

      Minor points

      a) It is surprising to observe an expression driven by the TIGS2 transgene in the larval fat body as it appears to be solely expressed in the intestine in adults. In which epithelial cell type of the intestine is TIGS2 expressed?

      Response2-5: We were also surprised (and indeed disappointed) that TIGS2 shows a broader expression pattern in the larvae. As far as we observed, it is expressed at least in the enterocytes (strongly in the anterior midgut).

      b) The authors have carefully defined an optimal dose of RU486 at 1 µM. Why use 20 µM (Fig. S1) or 50 µM (Fig. S6)? Of note, the Flygutseq indicates that Alp9&10 are downregulated in enterocytes upon P. entomophila challenge.

      Response2-6: We used 1 µM at first, but realised that 1 µM is too mild to carefully assess the expression pattern of the driver. Thank you for the note; we would cite the paper to generalise our finding.

      c) Fig. 1B&C: are the flies used in C) escapers, as hardly any flies survive the 5 µM RU486 challenge in B)?

      Response2-7: We prepared more than 1000 embryos for this and many other experiments. One percent of survivors is enough to produce flies in Fig. 1C.

      d) Fig. 1D: do the authors know why there is such a difference between DptA and Drosocin?

      Response2-8: We greatly appreciate this comment. There seems to have been a miscalculation here. We have repeated the experiment, and DptA and Drosocin now show a similar magnitude of induction. We would revise this figure.

      e) Fig. 2E: the caption does not allow to recognize which curve is LacZ RU and which one is IMD[CA] (dashed line?).

      Response2-9: We would amend the caption.

      f) Methods: the authors mention that they have dissected crop and Malpighian tubules. As no crop data are reported, does it mean that the crop and MT have been pooled in the same sample; please, clarify.

      Response2-10: Sorry for our confusing writing. We have now revised the text to clarify that we "removed" the crop and MTs.

      Significance:

      This study takes place in a context of the influence of infections during early life on subsequent fitness at the adult stage of organisms. With respect to mammals, it is important to note that Drosophila melanogaster undergoes a full metamorphosis that yields a thoroughly novel life form adapted to a new aerial life style. Thus, an influence of the larval stage on the imago is definitely interesting. The senior author has already published interesting work on this topic by showing that oxidative stress experienced during larval stages modifies adult fitness through an indirect action on the larval microbiota. This work is going to be of interest to investigators working on the microbiota and also on intestinal infections, let alone the community of entomologists.

      Response2-11: We are really happy to see this comment. We believe that it is important to provide evidence and elucidate the mechanisms by which gut microbiota alteration acts as a key factor through which a transient event at a juvenile stage exerts a life-long effect on host physiology.

      Drosophila host defense against infections, intestinal infections, host-pathogen interactions

      Reviewer #3

      Summary

      In their manuscript "Activation of innate immune signalling during development predisposes to inflammatory intestine and shortened lifespan" Yamashita et al. have used the Gene Switch system to temporally overexpress imd in Drosophila larval stages and followed the possible effect on adult food intake, starvation resistance and lifespan. Specifically, the authors show that activating the IMD pathway in Drosophila larvae leads to decreased lifespan, lower adult body weight and lower food intake. Furthermore, the authors claim that adult flies develop inflammation in the gut, and, as a consequence, a change in the gut microbiome. The study aims to show the effect of prolonged immune system activation at an early developmental stage on adults.

      Major comments

      The authors' main conclusion is that IMD activation during development results in adult inflammatory gut, which affects the lifespan of the flies as well as food intake and starvation resistance. Mifepristone (RU486) is used to induce gene expression under GeneSwitch drivers. Using mifepristone is a bit controversial when lifespan effects are being studied. The authors should state that there are various earlier studies showing that mifepristone affects lifespan and also metabolism (e.g. reduces mitochondrial functions and activates AMPK). Although it is fairly reliable that the effects that the authors are seeing are resulting from the IMD pathway activation, it can also be a stress response caused by a combination of mifepristone treatment + IMD activation.

      Response3-1: We would like to carefully discuss this possibility by citing the relevant literature.

      The authors show that a mifepristone concentration of 5 µM is causing severe lethality and low body weight in DaGS>IMDCA animals. The concentration of 1 µM doesn't give the same effect, but already induces gene expression (as confirmed by imaging in Fig. S1B). Throughout the study, the concentration of 5 µM is still used and the authors claim that the phenotype seen in DaGS>IMDCA animals is suggesting that IMD activation impairs larval growth. However, can this be a case of toxicity/synthetic lethality caused by high concentration of RU486? Why wasn't 1 µM concentration used for the experiments, if it's sufficient to induce gene expression? Is there a possibility of using another temporal induction method causing less stress/toxicity for the flies? Furthermore, authors show that 1 µM mifepristone treatment shortens female lifespan, which is contradictory to the earlier literature. Citations are needed here. Also, the decrease in female lifespan looks like it is non-significant, what statistics were used in this analysis? The methods section says OASIS2 software was used, but no further details are provided.

      Response3-2: We apologise for our unclear writing. We used 1 µM throughout the study, not 5 µM, to avoid the drug's toxicity. We have not tested other methods, as GS works well when the RU486 dose is carefully optimised. For the lifespan statistics, we would like to add detailed information to the Methods section.

      Only under 10% of DaGS>IMDCA flies exposed to 5 µM RU486 eclose, yet in Fig. 1C showing the results of body weight measurements, n=20-50. How were the DaGS>IMDCA flies obtained if under the experimental conditions only a few of them develop successfully? At which developmental stage do the flies die? Why were only male flies used for this experiment?

      Response3-3: Please see our Response2-7. We did not carefully check the developmental stage, but the animals apparently died at early larval stages. We usually use male flies for body weight, as female body weight is understandably affected by the number of eggs inside the body, making it difficult to discuss the phenotype of developmental growth.

      More evidence is needed before concluding that the IMD lifespan effect is coming from the inflammatory intestine. TIGS driver is used to express genes of interest in the gut and fat body. No specific drivers for only the gut or only the fat body are used. Can it be claimed that the effect seen is coming purely from the gut expression? Is it possible that the fat body, which is the main organ responsible for the AMP production, is actually responsible for enhanced IMD pathway target AMPs expression (as shown in Fig. S2A; the fold change is higher in the gut than in the fat body)? Was the gut not inflamed or damaged in larvae as there was no upd3 expression?

      Response3-4: Thank you for raising this important point. Indeed, we have looked for a larval gut- (or fat body-) specific GeneSwitch driver, but unfortunately none were suitable. We admit that our conclusion is not thoroughly backed by the data, so we would carefully discuss this in the revised manuscript. Nevertheless, our new data showing dysbiosis in the larval gut now indicate that this is where the irreversible phenotype resides.

      If the authors want to state that the effect is coming from inflammatory gut and that the lifespan effect and feeding/starvation resistance effect is coming from other tissues, why did the authors still decide to use the daughterless driver to study the IMD effect on lifespan, rather than gut or fat body driver, especially if they show that the feeding rate is changed (IMD OE in neurons) as this can also affect the microbiota (which they state is because of inflammatory gut)?

      Response3-5: We used the DaGS driver simply because it gave a stronger lifespan phenotype. One could assume that the decreased feeding of the DaGS>IMDCA flies might influence the increased Gluconobacter, the inflammatory gut, and the shortened lifespan. However, these phenotypes go in the opposite direction, as decreased feeding would in theory decrease gut bacteria and extend lifespan. We would like to use a gut-specific (or even cell-type-specific) GeneSwitch driver for further mechanistic study, but this may take a huge effort. The take-home message of the present study is that a juvenile-restricted inflammatory experience causes early dysbiosis, which triggers a persistent inflammatory gut in adults and thereby shortens lifespan. We believe this is adequately supported by the data.

      Immune responses are costly and that's one reason why their negative control is so important. The authors could state possible effects between continuously activated immune system (IMD pathway in larvae) and trade-offs in size and life-span in adult flies (+ citations to related studies). The role of constitutively activated IMD in larvae could have been confirmed by using an alternative method for activating IMD, e.g. knock out of a negative regulator. Additional controls could have been used, e.g. DaGS background strain without the daughterless driver crossed with the IMDCA, or in the experiment where the gut microbiota was checked (this experiment was lacking the DaGS>LacZ + mifepristone treatment and only had DaGS>IMDCA flies with and without the mifepristone treatment). Usually in Drosophila genetics more control crosses are needed, e.g. two different constructs of the OE IMD strains, e.g. GD and KK backgrounds. The efficiency of the IMD OE could have been directly measured with qPCR and not only shown by measuring the expression of target AMPs.

      Response3-6: We would like to make this point clearer. The phenotype observed in our study is not related to a trade-off between size and lifespan, since we used 1 µM RU486, which did not affect body size (Fig. 1C) but did shorten lifespan (upon larval but not adult IMD activation). In this sense, we tried to avoid strong immune activation in the larva, as it disturbed development. Regarding other methods for activating IMD, we were not able to use knockouts because the manipulation needs to be temporally restricted to larvae. Alternatively, we tested PGRP-LC overexpression. When it was expressed strongly in the larvae, it led to lethality. When it was mild, we observed a shortened lifespan just as with IMDCA overexpression. These new data would support our conclusion well. Please note that we use IMD OE, not RNAi (GD and KK lines are RNAi lines).

      Regarding gut microbiota, we would like to check whether DaGS>LacZ + RU486 affects Gluconobacter. Regarding the efficiency of IMD OE, we would like to perform qPCR of the IMD gene.

      One of the conclusions drawn is that adults develop gut tissue damage as a result of inflammation. The authors could provide further evidence of this by utilizing microscopy to recognize possible changes in gut epithelia (with appropriate controls).

      Response3-7: We appreciate the suggestion. Somewhat intriguingly, we have not observed any difference in the number of ph3-positive cells, a hallmark of tissue damage-induced ISC proliferation. This is consistent with our preliminary observation that aged flies after larval IMD activation did not show the "smurf" phenotype, an indicator of gut barrier dysfunction. In the revised manuscript, we would like to add qPCR data testing whether the upd3/JAK-STAT pathway is activated, as a readout of tissue damage, and carefully discuss this point.

      The methods section could be more detailed and clearer to the reader. The statistical analyses used for e.g. survival rates should be described in more detail. The sustained alkaline phosphatase treatment should also be described in more detail, as currently the methods do not clearly state how long the flies were treated with Alp. The description of antibiotic cocktail treatment in the materials and methods should not be under the stocks and husbandry section, as it implies that all flies used were all the time maintained on an antibiotic cocktail.

      Methods sections could be arranged to resemble more the order of the results sections and more details should be added. It would be challenging to repeat the experiments the way as they have been described.

      Response3-8: We would like to amend the method section accordingly.

      Minor comments

      The efficiency of the IMD OE was not directly measured with qPCR, only the expression of target AMPs was measured. The authors should show the activation efficiency of the IMD expression.

      Response3-9: Please see our Response1-2

      Figure 1B, are these females or males?

      Response3-10: It includes both sexes. We will add this explanation to the Methods.

      Fig1 E. in the transcriptome analysis the negative control should have been also treated with mifepristone

      Response3-11: For financial reasons, we could not perform RNAseq analysis for all the samples. We believe that showing specific activation of the IMD pathway in IMDCA + RU486 compared with the negative control IMDCA -RU486 is sufficient.

      For the experiment presented in Fig. S6, females are used, although for the majority of other experiments, only male flies are used?

      Response3-12: We have done qPCR in males as well. We will add these data to the revised manuscript.

      In Fig. S1C, DaGS>GFP expression is induced in 3rd instar larvae by 20 µM RU486. Is concentration this high not toxic for the larvae?

      Response3-13: In this experiment, we wanted to see the expression pattern of the driver. Please also see our Response2-6.

      The fact that developmental IMD activation increased DptA expression in the adult gut suggested that an irreversible change occurred in this tissue. - what is meant by irreversible change? Can this claim be made?

      Response3-14: What we meant by "irreversible" here was that there was a permanent increase of immune activity caused by the larval IMD activation. This word may have been inappropriate to describe the phenotype, so we would avoid it in the revised manuscript.

      Alp results are interesting. Does IAP expression decrease in adults as they age? The authors used "sustained Alp supplementation" to rescue the reduced lifespan phenotype in adults. How long were the flies treated with Alp? This should be mentioned in the materials and methods

      Response3-15: According to the literature, IAP expression decreases during ageing (Kühn F et al., JCI Insight, 2020). In this experiment, we used life-long IAP supplementation (from day 2 onward). This would be mentioned in the revised manuscript.

      The description of antibiotic cocktail treatment in the materials and methods should not be under the stocks and husbandry section, as it implies that all flies used were all the time maintained on an antibiotic cocktail.

      In the qRT-PCR section, the analysis method could be added (copy number method/ΔΔCt)

      Line 49-50 is missing a reference

      Line 81, PGAM5 is mentioned without further explaining what it is

      Line 229 - what is meant by inflammatory vicious cycles?

      Line 314 - what is meant by thrifty phenotype?

      In figures showing lifespan, a different color code could be used where yellow and orange/red lines represent different genotypes/treatments; it is hard to visually distinguish the colors that are used at the moment

      Figure legend for Fig. 4C - AP could be written out as alkaline phosphatase already here. Also in the legend for Fig. 4 it says E twice (instead of E and then F)

      Fig. 5A - a title for x-axis could be added to make it clearer that this represents the proportion of the bacterial taxa in the gut

      Fig. S2A - LacZ is mentioned in the description but not shown in the figure

      Response3-16: We would amend these in the revised manuscript.

      Were there possible cross tissue contaminations in the adult gut samples? Were possible contaminations checked e.g. with fatbody specific primers? This should be checked as fatbody is known to produce more AMPs when immune activated, than the gut tissue.

      Response3-17: We are well-trained in the dissection of the gut. All the fat body was carefully removed by dissection. Especially given that abdominal samples do not show any difference in Fig. 4B, we do not think that contamination would explain the data.

      CFU analysis: were the flies surface sterilized briefly in ethanol prior to dissection?

      Response3-18: Yes, flies were surface sterilized by serial washes of 3% bleach and 70% ethanol. We will add this procedure to the Methods section.

      Fig2 B-C, the differences between the females and males are not drastic enough to decide to use only males later on. E. typo in starvation. DA>IMD males have decreased starvation resistance without and with the mifepristone treatment?

      Response3-19: We decided to use males as females show a slight negative side effect of RU486. DaGS>IMDCA males have increased starvation resistance only with the mifepristone treatment. We apologise that our figure caption is not clear. We would amend this in the revised manuscript.

      Significance

      The topic presented in this manuscript is interesting and relevant for both the fields of aging and immunology and partially explains why early life experiences are important for the wellbeing of the individual later in life. Some of the findings presented in the manuscript are novel, at the same time some of these same issues have been examined in papers related to immune priming/training/memory. The reported findings of the manuscript would be of interest for an audience that is interested about aging and lifespan related issues, as well as immunology and metabolism.

      Response3-19: This reviewer's evaluation of the significance of the study is very encouraging. We believe that the phenotypes observed in the manuscript will be of wide interest to biologists working on this hot topic: how early-life events influence later-life health.

      Field of expertise: Innate Immunity; Drosophila; Metabolism; Host-Pathogen Interactions; Biomedicine

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      In their manuscript "Activation of innate immune signalling during development predisposes to inflammatory intestine and shortened lifespan" Yamashita et al. have used the Gene Switch system to temporally overexpress imd in Drosophila larval stages and followed the possible effect on adult food intake, starvation resistance and lifespan. Specifically, the authors show that activating the IMD pathway in Drosophila larvae leads to decreased lifespan, lower adult body weight and lower food intake. Furthermore, the authors claim that adult flies develop inflammation in the gut, and, as a consequence, a change in the gut microbiome. The study aims to show the effect of prolonged immune system activation at an early developmental stage on adults.

      Major comments

      The authors' main conclusion is that IMD activation during development results in adult inflammatory gut, which affects the lifespan of the flies as well as food intake and starvation resistance. Mifepristone (RU486) is used to induce gene expression under GeneSwitch drivers. Using mifepristone is a bit controversial when lifespan effects are being studied. The authors should state that there are various earlier studies showing that mifepristone affects lifespan and also metabolism (e.g. reduces mitochondrial functions and activates AMPK). Although it is fairly reliable that the effects that the authors are seeing are resulting from the IMD pathway activation, it can also be a stress response caused by a combination of mifepristone treatment + IMD activation. The authors show that a mifepristone concentration of 5 µM is causing severe lethality and low body weight in DaGS>IMDCA animals. The concentration of 1 µM doesn't give the same effect, but already induces gene expression (as confirmed by imaging in Fig. S1B). Throughout the study, the concentration of 5 µM is still used and the authors claim that the phenotype seen in DaGS>IMDCA animals is suggesting that IMD activation impairs larval growth. However, can this be a case of toxicity/synthetic lethality caused by high concentration of RU486? Why wasn't 1 µM concentration used for the experiments, if it's sufficient to induce gene expression? Is there a possibility of using another temporal induction method causing less stress/toxicity for the flies? Furthermore, authors show that 1 µM mifepristone treatment shortens female lifespan, which is contradictory to the earlier literature. Citations are needed here. Also, the decrease in female lifespan looks like it is non-significant, what statistics were used in this analysis? The methods section says OASIS2 software was used, but no further details are provided.

      Only under 10% of DaGS>IMDCA flies exposed to 5 µM RU486 eclose, yet in Fig. 1C showing the results of body weight measurements, n=20-50. How were the DaGS>IMDCA flies obtained if under the experimental conditions only a few of them develop successfully? At which developmental stage do the flies die? Why were only male flies used for this experiment?

      More evidence is needed before concluding that the IMD lifespan effect is coming from the inflammatory intestine. TIGS driver is used to express genes of interest in the gut and fat body. No specific drivers for only the gut or only the fat body are used. Can it be claimed that the effect seen is coming purely from the gut expression? Is it possible that the fat body, which is the main organ responsible for the AMP production, is actually responsible for enhanced IMD pathway target AMPs expression (as shown in Fig. S2A; the fold change is higher in the gut than in the fat body)? Was the gut not inflamed or damaged in larvae as there was no upd3 expression?

      If the authors want to state that the effect is coming from inflammatory gut and that the lifespan effect and feeding/starvation resistance effect is coming from other tissues, why did the authors still decide to use the daughterless driver to study the IMD effect on lifespan, rather than gut or fat body driver, especially if they show that the feeding rate is changed (IMD OE in neurons) as this can also affect the microbiota (which they state is because of inflammatory gut)?

      Immune responses are costly and that's one reason why their negative control is so important. The authors could state possible effects between continuously activated immune system (IMD pathway in larvae) and trade-offs in size and life-span in adult flies (+ citations to related studies). The role of constitutively activated IMD in larvae could have been confirmed by using an alternative method for activating IMD, e.g. knock out of a negative regulator. Additional controls could have been used, e.g. DaGS background strain without the daughterless driver crossed with the IMDCA, or in the experiment where the gut microbiota was checked (this experiment was lacking the DaGS>LacZ + mifepristone treatment and only had DaGS>IMDCA flies with and without the mifepristone treatment). Usually in Drosophila genetics more control crosses are needed, e.g. two different constructs of the OE IMD strains, e.g. GD and KK backgrounds. The efficiency of the IMD OE could have been directly measured with qPCR and not only shown by measuring the expression of target AMPs.

      One of the conclusions drawn is that adults develop gut tissue damage as a result of inflammation. The authors could provide further evidence of this by utilizing microscopy to recognize possible changes in gut epithelia (with appropriate controls).

      The methods section could be more detailed and clearer to the reader. The statistical analyses used for e.g. survival rates should be described in more detail. The sustained alkaline phosphatase treatment should also be described in more detail, as currently the methods do not clearly state how long the flies were treated with Alp. The description of antibiotic cocktail treatment in the materials and methods should not be under the stocks and husbandry section, as it implies that all flies used were all the time maintained on an antibiotic cocktail.

      Methods sections could be arranged to resemble more the order of the results sections and more details should be added. It would be challenging to repeat the experiments the way as they have been described.

      Minor comments

      The efficiency of the IMD OE was not directly measured with qPCR, only the expression of target AMPs was measured. The authors should show the activation efficiency of the IMD expression.

      Figure 1B, are these females or males?

      Fig1 E. in the transcriptome analysis the negative control should have been also treated with mifepristone

      For the experiment presented in Fig. S6, females are used, although for the majority of other experiments, only male flies are used?

      In Fig. S1C, DaGS>GFP expression is induced in 3rd instar larvae by 20 µM RU486. Is concentration this high not toxic for the larvae?

      The fact that developmental IMD activation increased DptA expression in the adult gut suggested that an irreversible change occurred in this tissue. - what is meant by irreversible change? Can this claim be made?

      Alp results are interesting. Does IAP expression decrease in adults as they age? The authors used "sustained Alp supplementation" to rescue the reduced lifespan phenotype in adults. How long were the flies treated with Alp? This should be mentioned in the materials and methods

      The description of antibiotic cocktail treatment in the materials and methods should not be under the stocks and husbandry section, as it implies that all flies used were all the time maintained on an antibiotic cocktail.

      In the qRT-PCR section, the analysis method could be added (copy number method/ΔΔCt)

      Line 49-50 is missing a reference Line 81, PGAM5 is mentioned without further explaining what it is Line 229 - what is meant by inflammatory vicious cycles? Line 314 - what is meant by thrifty phenotype?

      In figures showing lifespan, a different color code could be used where yellow and orange/red lines represent different genotypes/treatments; it is hard to visually distinguish the colors that are used at the moment

      Figure legend for Fig. 4C - AP could be written out as alkaline phosphatase already here. Also in the legend for Fig. 4 it says E twice (instead of E and then F)

      Fig. 5A - a title for x-axis could be added to make it clearer that this represents the proportion of the bacterial taxa in the gut

      Fig. S2A - LacZ is mentioned in the description but not shown in the figure

      Were there possible cross tissue contaminations in the adult gut samples? Were possible contaminations checked e.g. with fatbody specific primers? This should be checked as fatbody is known to produce more AMPs when immune activated, than the gut tissue.

      CFU analysis: were the flies surface sterilized briefly in ethanol prior to dissection?

      Fig2 B-C, the differences between the females and males are not drastic enough to decide to use only males later on. E. typo in starvation. DA>IMD males have decreased starvation

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, the authors study the impact of ubiquitously activating the IMD pathway only during larval stages on subsequent adult life. They report a shortened lifespan due to IMD pathway activation in the larval gut and a resistance to starvation linked to its activation in the nervous system. While there is apparently no activation of the IMD pathway in very young adult flies, the expression of some IMD-dependent antimicrobial peptide (AMP) genes is reported from 7-10 flies onwards. This expression is lost upon treating the adults with antibiotics, which also rescues the shortened lifespan phenotype. It correlates with a possible increase in the proportion of Gluconobacter in the microbiota.

      While the study looks interesting, it is not clear whether the results, especially those of survival studies and RTqPCR experiments, have been replicated in independent experiments. This is essential to warrant their conclusions. In this respect, this reviewer notes some important variability in the lifespan studies (e.g., Fig. 2B vs. Fig. 4E): how do the authors account for a lifespan that is shortened almost by half in Fig. 4E? Also, Fig. S2B is not convincing given the observed variability. More data points are required to reach a conclusion.

      The authors suggest in their Discussion some kind of epigenetic mechanism transmitting the information of IMD pathway activation having occurred at larval stages. Whether this depends on a change of metabolism remains to be demonstrated, in as much it is likely that there is a major metabolic "reset" occurring during metamorphosis to prepare the individual to the new environmental conditions encountered as an adult. It is also likely that larvae in the wild grow in a microbe-rich slurry and are likely to experience intestinal infections. As noted by the authors themselves on the top paragraph of p7 (line numbers are unreadable), the larval gut is degenerated during metamorphosis and thus the enterocytes that have produced AMPs are no longer present. One possibility would be that there is an early dysbiosis already occurring during larval stages and that the young adults re-infect themselves, for instance through contact with the meconium. The authors' experiments with antibiotics are the key to this study. However, one would like to observe results of the converse experiment, that is, treating larvae with antibiotics (a better control would be to bleach the embryos to generate axenic flies) and then raising the hatched adult flies in a conventional manner. In this way, the authors may determine whether the influence of early IMD pathway activation occurs through "self" mechanisms or whether it entails a contribution from the microbiota. It might also be useful to use reporter transgenes such as Dpt-LacZ to document where in the gut IMD activation takes place in the adult and to monitor whether there is any weak signal that would not be picked up by RTqPCR in newly hatched flies.

      Specific comments

      1. The GS system used in this study requires multiple controls, as a study from the Seroude laboratory has reported a driver-dependent leakiness of expression independent of exposure to RU486 (Poirier et al., Aging Cell, 2008). Thus, it would be good to check this with a cross to a UAS-GFP driver and examining the 10 and 40-day time points. The same should be done with antibiotics-treated flies as regards DptA and Drosocin expression (Fig. 5C & D: the age of the adult flies is not specified; it would also be positive to examine the distribution of Acetobacter and Gluconobacter at 10 and 40 days).
      2. The authors state at the bottom of p6 that JAK-STAT-dependent AMP expression was detected. Fig. 4C shows a significant expression of Drsl2. As far as this reviewer recalls, Buchon et al. had demonstrated a dependence on the JAK-STAT pathway of Drsl3. It would also be worth looking at Turandot genes. As regards an involvement of the Toll pathway, it is not clear whether Drosomycin is significantly expressed as it shows a 32-fold increase in Fig. 4C, yet is not found in Table S2. This issue should be clarified using RTqPCR and it may be worth monitoring also the expression of BomS1.

      Minor points

      a) It is surprising to observe an expression driven by the TIGS2 transgene in the larval fat body as it appears to be solely expressed in the intestine in adults. In which epithelial cell type of the intestine is TIGS2 expressed?

      b) The authors have carefully defined an optimal dose of RU486 at 1 µM. Why use 20µM Fig. S1, or 50µM (Fig. S6)? Of note, the Flygutseq indicates that Alp9&10 are downregulated in enterocytes upon P. entomophila challenge.

      c) Fig. 1B&C: are the flies used in C) escapers as hardly any flies survive the 5µM RU486 challenge B)?

      d) Fig. 1D: do the authors know why there is such a difference between DptA and Drosocin?

      e) Fig. 2E: the caption does not allow to recognize which curve is LacZ RU and which one is IMD[CA] (dashed line?).

      f) Methods: the authors mention that they have dissected crop and Malpighian tubules. As no crop data are reported, does it mean that the crop and MT have been pooled in the same sample; please, clarify.

      Significance

      This study takes place in a context of the influence of infections during early life on subsequent fitness at the adult stage of organisms. With respect to mammals, it is important to note that Drosophila melanogaster undergoes a full metamorphosis that yields a thoroughly novel life form adapted to a new aerial life style. Thus, an influence of the larval stage on the imago is definitely interesting. The senior author has already published interesting work on this topic by showing that oxidative stress experienced during larval stages modifies adult fitness through an indirect action on the larval microbiota. This work is going to be of interest to investigators working on the microbiota and also on intestinal infections, let alone the community of entomologists.

      Drosophila host defense against infections, intestinal infections, host-pathogen interactions

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This paper shows that transient genetic induction of the IMD innate immune pathway during Drosophila development has long-term effects on adult health and lifespan. The paper is well-written, the experiments are well designed and executed, and the data are without exception good quality. The data also support the specific conclusions well. The experiments take full advantage of the Drosophila system to pinpoint the effect on lifespan to long-term activation of inflammation in the gut, which is interlinked and dependent upon changes in the microbiota. However, the analysis is not comprehensive, because neural-specific effects on starvation resistance are not followed up, and because the etiology of the changes in microbiota is not mapped out. I should also say that I do not fully agree with the conclusion in the last sentence of the Abstract (the most important general conclusion), that the study "demonstrates a tissue-specific programming effect" of early transient IMD function. Since the lifespan shortening was shown to be dependent upon increased gut Gluconobacter, I would not call this "programming" (though the term is vague enough to mean most anything.) Instead, I would refer to the effect as a host-environment interaction. If it were "programming" of, for instance, the genetic or epigenetic sort, it would not be so easy to reverse.

      A few other minor comments:

      1. In several experiments, the authors use GFP (Fig S1) or the IMD targets DptA or Dro (Fig S2) to validate the induction of IMD-CA. Why have they not directly measured the expression of IMD-CA? This would seem to be logical and technically easy, by qPCR.
      2. In Fig 4 we see an experiment in which animals were "supplemented" with Alkaline Phosphatase, a protein. How was this done and why does it work? Is AP a gut luminal protein?
      3. The results in Fig 5 are really where the paper begins to determine a mechanism for the lifespan shortening. However, these results are rather weak, and they don't extend very far. The increase in Gluconobacter is mild (Fig 5C), and is not clear in the 16S rRNA sequencing experiment (Fig 5A). Furthermore, it is not clear that Gluconobacter specifically is the source of the lifespan shortening, or just bacteria in general (Fig 5E).

      Significance

      Although this paper addresses an interesting topic using an elegant and effective experimental strategy, the final results (Fig 5) and conclusions are modest. The analysis doesn't extend far enough to demonstrate how long-term changes in microbiota arise from short-term developmental changes in innate immune activity. Moreover, there are no detailed data concerning how the altered microbiota alter lifespan. Thus, while the results are interesting and the findings open avenues for further studies on the topic, the significance of the paper is modest in its current state. Further analysis of how the microbiota is permanently changed, and why this affects lifespan, could enhance the paper. However, it is not clear that any simple, quick experiments could dramatically advance the findings from where they are now.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their appreciation of our work and for their constructive feedback. We have addressed their comments in the point-by-point answers below. We provide a largely revised manuscript as well as the plan for new experiments, following requests from the reviewers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In their manuscript, Ronchi P. et al. present a thorough and very well detailed workflow for 3D correlative light and electron microscopy of whole cells in large tissues. Their approach of iterative block trimming and fluorescence imaging combined with laser branding allowed them to explore previously inaccessible tissues and questions. They imaged mammary gland organoids, and resolved the organization of the cells in the organoid and mitotic events. They also specifically targeted tracheal terminal cells of a 3rd instar Drosophila larva labeled with cytoplasmic DsRed to study their ultrastructure, and several Drosophila ovarian follicular cells (FC) where the cytoplasmic motor protein dynein was knocked down (KD) by RNAi. In the tracheal cells, they observed connected secretory vesicles, probably delivering extra-cellular matrix to the trachea tube. They also found that the overall shape of dynein KD FCs is distorted compared to WT, and that the localization of multi-vesicular bodies/endosomes inside the FCs changed from an apical to a basal membrane localization. Although the approach is not entirely new, the manuscript certainly paves the way for future studies to obtain ultrastructural information from large specimens and combine it with meaningful fluorescence information. It is also beautiful and polished.

      Minor comments:

      1. The authors state (line 145) that they found the optimal concentration of UA and the best compromise between EM contrast and fluorescence preservation. However, no detail is provided as to how these parameters were experimentally determined.

      UA concentration can be optimized in a number of ways, including varying incubation temperature and time. We decided to modify the speed at which the temperature was increased after the freeze substitution step at -90°C. We have experimentally compared 3°C/h vs 5°C/h (described in the original on-section CLEM protocol by Kukulski et al) and found a considerable difference for some of the samples we used. This is now described in the revision (lines 152-159). While other protocols might work for some samples, we found this protocol to provide good quality imaging with a large variety of samples we have worked with (including some that are not included in the current paper, e.g. gastrulating Drosophila embryos or C. elegans larvae).

      • More detail as to how the block face was mounted and kept parallel to the glass bottom dish would be helpful.

      This is now described in lines 182-185.

      Also, what was the optical slice of the confocal and what was the increment in Z?

      The information is now included in lines 191-192.

      • Have the authors tried fluorophores with shorter wavelength (like GFP)? And if so, have they estimated the penetration depth in resin? This would be informative because many GFP lines already exist in the Drosophila model.

      In the current version, we have limited our study to red fluorescent proteins because UA is autofluorescent in green. This could cause problems when imaging at shorter wavelengths. We have discussed this in lines 442-444. However, we agree that an analysis of the behavior of GFP in confocal imaging of the block could improve our work and increase the potential applicability of this method. We are therefore planning an experiment to compare the behavior of EGFP and mCherry during confocal imaging of the block. This experiment will be included in a future revised version.

      • In figure 6 i, how did the authors identify the structures to be MVBs close to the basal surface in the mutant, seeing as they do not look like the MVBs seen in WT cells?

      In both cases, we identified MVBs as vesicles with a clear lumen containing one or more vesicles of homogeneous size. We have included a paragraph in the Material & Methods on “multivesicular body quantification” where this is specified (lines 597-599). The only difference between MVBs of WT and KD cells was their size (shown in Fig. 6j,k,l), and therefore the identification was unambiguous.

      Similarly, how were the structures identified as endosomes in figure 5f?

      We thank the reviewer for pointing this out. We agree that it is impossible to discriminate between endocytic and exocytic vesicles in our static data. We have therefore rephrased this as “membrane trafficking” (line 355, line 358, line 946).

      Can the authors quantify the total MVBs in the apical/basal membranes from both RNAi KD and WT?

      We have now segmented all MVBs in 5 KD cells and 5 neighboring WT cells in 4 different oocytes. Representative images, as well as a quantitative analysis of the distribution of MVBs, are shown in Fig. 6m-o. When we segmented MVBs for this analysis, we realized that WT cells showed large MVBs in their apical side (~5-10% of total MVBs) while in KD cells this population was almost completely absent. This is consistent with a role of dynein in MVB fusion. The data are now included in Fig. 6p. We thank the reviewer for her/his suggestion to have a more rigorous analysis of the MVBs, which allowed us to make another interesting discovery.

      • The motivation/question in the case of Drosophila samples was clear but not so obvious in the case of the mammary gland organoids. It would be nice if the authors could give a bit more information.

      We have included a justification for the use of organoids in lines 226-235.

      • In the introduction (line 124), the dimensions are given in microns and millimeters, which can be a bit confusing.

      We have changed this (line 132).

      • In the discussion (lines 427-431), the point that "sample preparation protocols compatible with fluorescence preservation have proven satisfactory for FIB-SEM milling and imaging" has also been shown by others (Porrati et al., 2019).

      We agree with the reviewer and indeed Porrati et al., 2019 was cited in the introduction. We have not claimed that we have shown this for the first time. For completeness, we cite the paper again in the discussion (line 469).

      • Figure 1:
        • It would be helpful if the cell referred to in g was highlighted.

      As suggested, we have indicated the cell with an arrowhead.

      - Is the cell in (h) the one in g or in a as written?

      We apologize for the mistake. It is indeed the one in g. We have corrected this (line 874).

      - Is the image in (k) inverted compared to (i)?

      The image in k is not inverted compared to i. We are showing raw images of the confocal and FIB-SEM datasets and therefore the two volumes are rotated 90° with respect to each other along the Y axis. As we have realized that this can be confusing for the readers, we have introduced a sentence in Materials and Methods to describe the different orientations between confocal and FIB-SEM datasets (lines 586-589).

      Figure 2:

      • In panel d it seems that some numbers on the x axis were duplicated.

      We apologize for the mistake. We have corrected figure 2.

      Figure 5:

      • How does the perfect overlap confirm the accuracy of targeting?

      We agree with the reviewer that the overlap is not a measure of accuracy. We have removed the sentence from the legend.

      - In panel (e) it was not particularly easy to understand what is the basal lamina.

      We have manually segmented the 2 basal membranes in different colors. We hope the reviewer will find this representation clearer.

      - In panel (g) the fused vesicle is not as clear as the movie. I also found it open to interpretation whether this is in fact a fused vesicle.

      We agree with the reviewer that a 3D object can be better appreciated in the stack image sequence rather than in a single 2D image. However, to help the visualization of the event in the figure, we have shown the 3 ortho-slices in a perspective view in Fig. 5g. This was the best representation we have found. The video with the stack will be available to the readers for a better inspection.

      We also agree that it is formally impossible to be sure whether the vesicle is in fact releasing material in the apical space or taking it up. Therefore, we describe now the event as “putative site of fusion…” (line 947).

      Reviewer #1 (Significance (Required)):

      The increasing demand for volume electron microscopy brings a lot of challenges to correlative light and volume electron microscopy workflows. Although the methods used by the authors are not new, their combination is original. The manuscript will certainly contribute to the field of correlative light and volume EM and provide a rather detailed protocol that can be reproduced by others. The workflow is also more efficient than what was previously achieved using x-ray instead of light microscopy (Bushong et al., 2015; Karreman et al., 2016).

      We thank the reviewer for the careful examination of our work and for the positive statement. We are aware that many of the methods used have already been described by others, but we believe that their combination is original and very powerful.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Ronchi et al describe a workflow designed to facilitate the identification and downstream relocation of fluorescently tagged regions of interest within millimetre scale samples, ending with focused ion beam SEM acquisition of the target area. The work follows a logical progression, is well thought out, explained, and illustrated, with proof of concept experiments that are followed up by examples of systems where the potential for the application of the workflow in a 'real' biological question is demonstrated.

      For me, the title reads better as ...targeting for FIB SEM acquisition...

      We have edited the title according to the reviewer’s suggestion.

      I have only minor suggestions for the revision of the manuscript from this initial version. The introduction, and introductory paragraphs for the two model systems would benefit from some revision to make them more concise however.

      We have revised and shortened the introduction and introductory paragraphs for the model systems and we hope the reviewer will find it more concise.

      Summary Line 22 - omit large.

      Done (line 29)

      Introduction Line 66 - It's probably clearer to discuss this concept as conductivity rather than grounding.

      We have changed this sentence (line 76).

      Line 105 - Peddie and Collinson 2014 is not the correct reference for this statement. Presumably this is supposed to be Peddie et al 2014?

      We thank the reviewer for spotting this mistake. We have changed the citation (line 1110).

      Line 124 - The external diameter of the carrier would give 7 mm2, but the internal diameter is smaller, so this size is slightly overstated.

      We totally agree. The internal diameter of the carrier is 2 mm and therefore the area is 3.14 mm². We have corrected the statement (line 132).
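
      As a quick check of the two areas quoted in this exchange, assume a circular carrier cavity. The 3 mm external diameter used below is an inference from the reviewer's figure of about 7 mm² and is not stated in the text; the 2 mm internal diameter is the value the authors give.

```latex
A_{\text{ext}} = \pi \left(\tfrac{3\ \text{mm}}{2}\right)^{2} \approx 7.07\ \text{mm}^{2},
\qquad
A_{\text{int}} = \pi \left(\tfrac{2\ \text{mm}}{2}\right)^{2} \approx 3.14\ \text{mm}^{2}
```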

      Results General comment - I find the use of NxNxN/N nm3 to be a confusing way of expressing the measurements, so would suggest splitting these up to express as: N nm3 or NxNxN nm.

      To avoid confusion, we have now opted for: N nm x N nm x N nm.

      Line 141 - no water was used in the FS mixture, and so wasn't needed for preservation of fluorescence? Dry/100% acetone? If no water is needed, this detail should be discussed.

      We added a clarification of this point (lines 148-150).

      Line 142 - could the authors elaborate on the statement about timing and sample types, to give a better understanding of the context.

      The sentence referred to other possible applications (e.g. cell monolayers would require shorter FS time). However, as the method described here is aimed at large 3D samples, we find that longer FS times (72 h) are always required. We have therefore removed the sentence (line 151).

      Line 150 - on the choice of fluorophores, did the authors examine any shorter wavelengths, or was the decision to use red/far red based on any other evidence? Anecdotally, red and far red fluorophores may offer better preservation and less longevity in this context, but could the authors elaborate on their reasoning behind the choice shown here?

      As replied to reviewer 1, point 3:

      In the current version, we have limited our study to red fluorescent proteins because UA is autofluorescent in green. This could cause problems when imaging at shorter wavelengths. We have discussed this in lines 442-444.

      However, we agree that an analysis of the behavior of GFP in confocal imaging of the block could improve our work and increase the potential applicability of this method. We are therefore planning an experiment to compare the behavior of EGFP and mCherry during confocal imaging of the block. This experiment will be included in a future revised version.

      Line 168 - did immersion in water give rise to any distortion of the resin, or is HM20 sufficiently hydrophobic that this was not a concern? Mismatches in refractive indices (resin, water, glass, oil) could also presumably give rise to some small inaccuracies in depth prediction?

      We observed a little distortion of the block face, due to hydration during the imaging step. However, as noticed during trimming at the microtome, this distortion was small and we could achieve a flat surface after removing 1-2 mm. Therefore this was not relevant for our measurements. We however now mention this in the discussion (lines 453-455). Mismatches of refractive indices also introduce inaccuracies, but these aberrations are reduced the closer the target is to the surface. Therefore, our predictions become more precise after each trimming step to approach the target.
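
      The depth inaccuracy from refractive index mismatch discussed in this exchange can be illustrated with a first-order (paraxial) estimate, in which the true focal depth scales with the ratio of the sample and immersion refractive indices. The sketch below is not part of the authors' workflow: the function name and the index values (1.33 for water, 1.5 as a placeholder for the resin) are assumptions chosen only for illustration, and the approximation ignores the aberrations that become important at high numerical aperture.

```python
def corrected_depth_um(nominal_depth_um: float,
                       n_sample: float,
                       n_immersion: float) -> float:
    """Paraxial estimate of the true focal depth inside a sample.

    nominal_depth_um: stage displacement from the block surface (micrometres).
    n_sample:         refractive index of the embedding medium (assumed value).
    n_immersion:      refractive index of the immersion medium (assumed value).
    """
    if nominal_depth_um < 0:
        raise ValueError("depth must be non-negative")
    if n_sample <= 0 or n_immersion <= 0:
        raise ValueError("refractive indices must be positive")
    # First-order axial scaling factor; equals 1 when the indices match.
    scale = n_sample / n_immersion
    return nominal_depth_um * scale


if __name__ == "__main__":
    # Hypothetical values: water immersion (1.33), placeholder resin index (1.5).
    for nominal in (10.0, 50.0, 100.0):
        true_depth = corrected_depth_um(nominal, n_sample=1.5, n_immersion=1.33)
        error = true_depth - nominal
        print(f"nominal {nominal:6.1f} um -> estimated {true_depth:6.1f} um "
              f"(offset {error:4.1f} um)")
```

      Because the offset is proportional to the nominal depth, it shrinks as the target approaches the surface, which is consistent with the authors' remark that predictions become more precise after each trimming step.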

      Line 169 - was it possible to quantify the increase in signal? If the block is being hydrated, but the block is not absorbing water (re above point), then it must only be surface fluorophores that are hydrated

      The quantified increase in fluorescence signal at the surface is now mentioned here (line 187) and can be observed in Fig. 2b. Indeed, only surface fluorophores are hydrated and we argue that this is an important player in the fluorescence intensity increase.

      Line 179 - presumably this is a result of the surface of the block being hydrated (re above points). This is mentioned later, but could be explicitly stated here to make the point more strongly.

      We now state this also in line 186.

      Line 188 - Peddie et al 2014 contains some limited data for mCherry in sections that could be worth mentioning in support of the findings of reduced photobleaching rates

      Thank you for pointing this out. We now cite Peddie et al 2014 (line 208-209)

      Line 268 - It is not explicitly stated earlier, but multiple targets at similar depths would also be possible, presumably.

      We have included a sentence to address this possibility (line 292-293).

      Discussion Line 421 - sections cannot be repeatedly imaged without bleaching too much? Please elaborate on this statement to help strengthen the point as it isn't mentioned earlier in the results.

      Our experience with in section fluorescence imaging is that fluorescent proteins are not very stable and bleach rather quickly. However, as we have not measured this with the same setup and with the same samples, we do not have a rigorous proof for this statement. As we believe the comparison with sections is not an important point here, we have removed the sentence (line 463)

      Line 435 - FIB SEMs and 2Pi systems are not really so 'common' in the sense suggested, so this final statement should be reworded.

      We have changed the sentence (lines 475-477)

      M&Ms Line 540 - grooves, not groves

      Changed (line 589)

      Figure 3 legend Overall, it's a workflow comprising many methods, so it's best described as a schematic of the workflow.

      Changed (line 900)

      Confocal panel - target, not targets, and depth is misspelt.

      We thank the reviewer for spotting these mistakes. We have corrected the figure.

      Figure 4 legend Line 834 - as far as I can see, this is a different organoid that isn't shown in a and b, so this text should be removed.

      The organoid is indeed a different one. We meant that the targeting was performed as shown in a and b. However, as the sentence could generate confusion, we have removed it (line 932).

      Figure 5 legend Line 840 - was, not is

      Changed (line 937)

      Figure 6 a) It would help with clarity to also put e.g. white arrows on the WT epithelium

      As we use arrows and arrowheads to indicate different events in the image, we have used green asterisks to label the nucleus of the WT cell and a red asterisk for the KD, as we have done in all the panels in figure 6, where both cell types are present in the same image.

      f,g) It isn't really clear on first viewing what these images show, so they would benefit from some labels.

      We have added labels to indicate all the cells represented in the images as well as the space in between (VM, vitelline material). Microvilli are now indicated with arrowheads. We have also explained in the figure legend that here we show in detail the structures indicated by black arrows in Fig. 6a, to help give a context to the high mag detail (lines 964-965).

      Minor stylistic comments: There should be a space between numbers and units; this is inconsistent throughout.

      We have corrected this.

      The use of black versus white text on the figures is inconsistent.

      We have fixed this.

      Table 1 - is it in the supplementary material or not? If it is, it should be referenced as such in the text. The formatting could use some refinement to match the standard of the other figures.

      The table is supplementary material. We have now referenced it as such and we have reformatted it.

      Capitalisation is inconsistent throughout.

      We have revised the text.

      The manuscript describes a workflow that connects several pre-existing methods to enable precision targeting of individual fluorescently tagged structures within a larger sample volume. The possibility for multi-modal imaging within a single embedded specimen facilitates correlation of data for structure with that of function. The work will be of interest to all scientists in the field of correlative microscopy.

      We thank the reviewer for her/his positive evaluation.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript is written very clearly overall. I would like to raise a number of issues that the authors might address. Most are at the level of proof-reading.

      The workflow still depends on availability of a specialized confocal microscope with two-photon laser excitation for marking the region of interest. A tweak to the method might simply be to scratch or etch markings onto the planed surface near the edges. Provided a motorized stage is available on the light microscope, the region of interest could be located precisely with reference to those, and then relocated in the SEM. It would be enough to suggest this, or another similar method, for those who don't have access to the two-photon microscope.

      In our view, the 2pi branding is important to position the FIB-SEM acquisition with high precision, reliability and confidence. However, we agree with the reviewer that there are other approaches to accomplish this task, which we now mention in the text. One is to simply measure the distance from the edges or corners of the block (lines 256-259). Another could be to manually introduce landmarks (lines 259-260).

      The second is to clarify in the text that the top-down view of the confocal microscopy is orthogonal to that of the FIB. This appears as a note in the caption to Figure 1, but it is an important point to align the expectations of readers who are not closely familiar with the methods.

      We agree with the reviewer that this is a point that requires further clarification. We have described this in Materials & Methods in the paragraph “Image processing, dataset registration, visualization and segmentation” (lines 586-589).

      The legend labels in Figure 1 do not match the figure itself, as if it were recompiled from an earlier draft: g-j) refers next to a).

      We apologize for the mistake. We have corrected it.

      The decrease in fluorescence intensity with depth into the specimen remains a bit ambiguous. The significant part of the text is dedicated to the suggestion that inherent protein fluorescence is affected by water content in the resin. After cutting back from the surface, are the originally deeper layers still dim, or do they become brighter? In other words, is the effect chemical or optical?

      As we wrote in the discussion, probably both optical effects and hydration play a role in the observed fluorescence drop. The hydration we describe probably only takes place on the block surface when dipping the block in water for imaging. Therefore, when we expose deeper layers after removing the resin on top, they do become brighter. However, we cannot completely disentangle the optical and hydration effect. To make this clearer, we have explained the point in more detail in the discussion (lines 452-455). At the same time, we are planning a new experiment to compare the fluorescence signal in the presence or absence of water in the dish, which will allow us to discriminate between the two effects.

      • Loss of confocal intensity with depth would be expected on the basis of a refractive index mismatch to the design parameters of the objective, especially for high numerical aperture. The objective is specified as multi-immersion but no further details are given.

      Details of the lens we used are now given in Materials and Methods (lines 539-540).

      Another easy test would be to embed fluorescent beads as intensity standards. There could also be absorption of the fluorescence emission by the resin and stain, but such a strong effect in a few tens of microns would suggest that the block is quite dark. That seems inconsistent with the images in supplementary figures. Personally I was not bothered by the dimming in depth, since the conclusions do not depend on quantitative fluorescence intensities.

      We agree with the reviewer that, although the fluorescence intensity drop is an effect that is worth describing because it has an implication for the identification of fluorescent targets in the block, our method does not rely on quantitative imaging. In all cases, we were able to detect fluorescence signal even very deep from the block surface and this was enough to target those cells at the FIB-SEM.

      In some cases pre-embedding correlative imaging can be quite successful, for example in studies of Jost Enninga (e.g., Mellouk et al, Cell Host & Microbe 2014) or Eric Jorgensen (Watanabe et al, Nature Methods 2011). Do the authors see a distinction between adherent cell cultures and unsupported tissues or tissue sections?

      We completely agree with the reviewer that pre-embedding CLEM can be extremely successful and that it is a very valuable tool, especially for the study of dynamic events in cell cultures. However, while for adherent cells the targeting is essentially a 2D problem and is facilitated by the fact that cells can be identified on the surface of the block under the SEM beam, for larger 3D samples the situation is much more complicated. We often lack landmarks and surface references, and an anisotropic deformation occurs during sample prep, making targeting and localization prediction extremely inaccurate.

      Other investigators have insisted that FIB-SEM requires especially heavy labelling. What was done differently here to make the light labeling possible? Such clues may be very useful to ongoing developments in the literature. Also, the present protocol skips osmium staining entirely. The authors must have compared images with and without osmium. What visible features do we lose as a result?

      We provide a detailed freeze substitution protocol in table S1, such that the method can be easily reproduced. Although FIB-SEM imaging of osmium-free samples is not very common, it has been shown by others before (Porrati et al., 2019), with a slightly different FS protocol.

      We found that our sample preparation is good enough for the detection of all membranous organelles, but also microtubules, centrioles and other subcellular structures. We did not observe any big difference compared to the more standard protocols containing osmium (line 136).

      Perhaps the greatest challenge to large volume electron microscopy is to deal with rare events. Correlative fluorescence light-electron microscopy effectively addresses the issue of finding the region of interest in a two-dimensional specimen such as a thin section or even a monolayer cell culture. For tissues the solutions are still at large. It is almost always impractical to image an entire organ at the resolution required to see macromolecules (work of Harald Hess being the exception that proves the rule). The issue is especially acute where the imaging is destructive, as in the case of serial block-face and FIB-SEM tomography. MicroCT has been used so far as the method of choice in the work-up to locate the region of interest within a large specimen, but the approach requires expensive equipment and time-consuming analysis. Furthermore, it can provide directional clues solely on the basis of morphology. Fluorescence would be a far simpler tool, and more informative when labeling is directed to specific molecular components. The manuscript of Ronchi et al provides a much-needed demonstration and detailed set of instructions for 3D CLEM en route to FIB-SEM volume imaging. The examples presented are both convincing and esthetic. Success depended on integration of a number of factors, including changes to the specimen preparation, so the workflow will be very useful. In short, I recommend publication.

      We thank the reviewer for the generous comments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript is written very clearly overall. I would like to raise a number of issues that the authors might address. Most are at the level of proof-reading.

      The workflow still depends on availability of a specialized confocal microscope with two-photon laser excitation for marking the region of interest. A tweak to the method might simply be to scratch or etch markings onto the planed surface near the edges. Provided a motorized stage is available on the light microscope, the region of interest could be located precisely with reference to those, and then relocated in the SEM. It would be enough to suggest this, or another similar method, for those who don't have access to the two-photon microscope.

      The second is to clarify in the text that the top-down view of the confocal microscopy is orthogonal to that of the FIB. This appears as a note in the caption to Figure 1, but it is an important point to align the expectations of readers who are not closely familiar with the methods.

      The legend labels in Figure 1 do not match the figure itself, as if it were recompiled from an earlier draft: g-j) refers next to a).

      The decrease in fluorescence intensity with depth into the specimen remains a bit ambiguous. The significant part of the text is dedicated to the suggestion that inherent protein fluorescence is affected by water content in the resin. After cutting back from the surface, are the originally deeper layers still dim, or do they become brighter? In other words, is the effect chemical or optical? Loss of confocal intensity with depth would be expected on the basis of a refractive index mismatch to the design parameters of the objective, especially for high numerical aperture. The objective is specified as multi-immersion but no further details are given. Another easy test would be to embed fluorescent beads as intensity standards. There could also be absorption of the fluorescence emission by the resin and stain, but such a strong effect in a few tens of microns would suggest that the block is quite dark. That seems inconsistent with the images in supplementary figures. Personally I was not bothered by the dimming in depth, since the conclusions do not depend on quantitative fluorescence intensities.

      In some cases pre-embedding correlative imaging can be quite successful, for example in studies of Jost Enninga (e.g., Mellouk et al, Cell Host & Microbe 2014) or Eric Jorgensen (Watanabe et al, Nature Methods 2011). Do the authors see a distinction between adherent cell cultures and unsupported tissues or tissue sections?

      Other investigators have insisted that FIB-SEM requires especially heavy labelling. What was done differently here to make the light labeling possible? Such clues may be very useful to ongoing developments in the literature. Also, the present protocol skips osmium staining entirely. The authors must have compared images with and without osmium. What visible features do we lose as a result?

      Perhaps the greatest challenge to large volume electron microscopy is to deal with rare events. Correlative fluorescence light-electron microscopy effectively addresses the issue of finding the region of interest in a two-dimensional specimen such as a thin section or even a monolayer cell culture. For tissues the solutions are still at large. It is almost always impractical to image an entire organ at the resolution required to see macromolecules (work of Harald Hess being the exception that proves the rule). The issue is especially acute where the imaging is destructive, as in the case of serial block-face and FIB-SEM tomography. MicroCT has been used so far as the method of choice in the work-up to locate the region of interest within a large specimen, but the approach requires expensive equipment and time-consuming analysis. Furthermore, it can provide directional clues solely on the basis of morphology. Fluorescence would be a far simpler tool, and more informative when labeling is directed to specific molecular components. The manuscript of Ronchi et al provides a much-needed demonstration and detailed set of instructions for 3D CLEM en route to FIB-SEM volume imaging. The examples presented are both convincing and esthetic. Success depended on integration of a number of factors, including changes to the specimen preparation, so the workflow will be very useful. In short, I recommend publication.

      I would like to raise a number of issues that the authors might address. Most are at the level of proof-reading.

      The workflow still depends on availability of a specialized confocal microscope with two-photon laser excitation for marking the region of interest. A tweak to the method might simply be to scratch or etch markings onto the planed surface near the edges. Provided a motorized stage is available on the light microscope, the region of interest could be located precisely with reference to those, and then relocated in the SEM. It would be enough to suggest this, or another similar method, for those who don't have access to the two-photon microscope.

      The second is to clarify in the text that the top-down view of the confocal microscopy is orthogonal to that of the FIB. This appears as a note in the caption to Figure 1, but it is an important point to align the expectations of readers who are not closely familiar with the methods.

      The legend labels in Figure 1 do not match the figure itself, as if it were recompiled from an earlier draft: g-j) refers next to a).

      The decrease in fluorescence intensity with depth into the specimen remains a bit ambiguous. A significant part of the text is dedicated to the suggestion that inherent protein fluorescence is affected by water content in the resin. After cutting back from the surface, are the originally deeper layers still dim, or do they become brighter? In other words, is the effect chemical or optical? Loss of confocal intensity with depth would be expected on the basis of a refractive index mismatch to the design parameters of the objective, especially for high numerical aperture. The objective is specified as multi-immersion but no further details are given. Another easy test would be to embed fluorescent beads as intensity standards. There could also be absorption of the fluorescence emission by the resin and stain, but such a strong effect in a few tens of microns would suggest that the block is quite dark. That seems inconsistent with the images in supplementary figures. Personally I was not bothered by the dimming in depth, since the conclusions do not depend on quantitative fluorescence intensities.

      In some cases pre-embedding correlative imaging can be quite successful, for example in studies of Jost Enninga (e.g., Mellouk et al, Cell Host & Microbe 2014) or Eric Jorgensen (Watanabe et al, Nature Methods 2011). Do the authors see a distinction between adherent cell cultures and unsupported tissues or tissue sections?

      Other investigators have insisted that FIB-SEM requires especially heavy labelling. What was done differently here to make the light labeling possible? Such clues may be very useful to ongoing developments in the literature. Also, the present protocol skips osmium staining entirely. The authors must have compared images with and without osmium. What visible features do we lose as a result?

      Significance

      Perhaps the greatest challenge to large volume electron microscopy is to deal with rare events. Correlative fluorescence light-electron microscopy effectively addresses the issue of finding the region of interest in a two-dimensional specimen such as a thin section or even a monolayer cell culture. For tissues the solutions are still at large. It is almost always impractical to image an entire organ at the resolution required to see macromolecules (work of Harald Hess being the exception that proves the rule). The issue is especially acute where the imaging is destructive, as in the case of serial block-face and FIB-SEM tomography. MicroCT has been used so far as the method of choice in the work-up to locate the region of interest within a large specimen, but the approach requires expensive equipment and time-consuming analysis. Furthermore, it can provide directional clues solely on the basis of morphology. Fluorescence would be a far simpler tool, and more informative when labeling is directed to specific molecular components. The manuscript of Ronchi et al provides a much-needed demonstration and detailed set of instructions for 3D CLEM en route to FIB-SEM volume imaging. The examples presented are both convincing and esthetic. Success depended on integration of a number of factors, including changes to the specimen preparation, so the workflow will be very useful. In short, I recommend publication.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Ronchi et al describe a workflow designed to facilitate the identification and downstream relocation of fluorescently tagged regions of interest within millimetre scale samples, ending with focused ion beam SEM acquisition of the target area. The work follows a logical progression, is well thought out, explained, and illustrated, with proof of concept experiments that are followed up by examples of systems where the potential for the application of the workflow in a 'real' biological question is demonstrated.

      For me, the title reads better as ...targeting for FIB SEM acquisition...

      I have only minor suggestions for the revision of the manuscript from this initial version. The introduction and the introductory paragraphs for the two model systems would, however, benefit from some revision to make them more concise.

      Summary

      Line 22 - omit large.

      Introduction

      Line 66 - It's probably clearer to discuss this concept as conductivity rather than grounding.

      Line 105 - Peddie and Collinson 2014 is not the correct reference for this statement. Presumably this is supposed to be Peddie et al 2014?

      Line 124 - The external diameter of the carrier would give 7 mm2, but the internal diameter is smaller, so this size is slightly overstated.
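
      The arithmetic behind this comment can be sketched as follows. This is only an illustration: the 7 mm2 figure is the one quoted above, while the internal diameter used below is a hypothetical value chosen for the example, not a dimension taken from the manuscript or from any carrier specification.

```python
import math

def disc_area_mm2(diameter_mm):
    """Area of a circular region with the given diameter, in mm^2."""
    return math.pi * (diameter_mm / 2) ** 2

def diameter_for_area_mm(area_mm2):
    """Diameter of the circle that has the given area, in mm."""
    return 2 * math.sqrt(area_mm2 / math.pi)

# Diameter implied by the 7 mm^2 area quoted in the comment above.
implied_diameter = diameter_for_area_mm(7.0)

# Hypothetical internal diameter, chosen only for illustration.
hypothetical_inner_diameter_mm = 2.0
inner_area = disc_area_mm2(hypothetical_inner_diameter_mm)

print(f"Diameter implied by 7 mm^2: {implied_diameter:.2f} mm")
print(f"Area for a {hypothetical_inner_diameter_mm} mm inner diameter: {inner_area:.2f} mm^2")
```

      Because the area scales with the square of the diameter, even a modest difference between external and internal diameter changes the usable area noticeably.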

      Results

      General comment - I find the use of NxNxN/N nm3 to be a confusing way of expressing the measurements, so would suggest splitting these up to express as: N nm3 or NxNxN nm.

      Line 141 - no water was used in the FS mixture, and so wasn't needed for preservation of fluorescence? Dry/100% acetone? If no water is needed, this detail should be discussed.

      Line 142 - could the authors elaborate on the statement about timing and sample types, to give a better understanding of the context.

      Line 150 - on the choice of fluorophores, did the authors examine any shorter wavelengths, or was the decision to use red/far red based on any other evidence? Anecdotally, red and far red fluorophores may offer better preservation and less longevity in this context, but could the authors elaborate on their reasoning behind the choice shown here?

      Line 168 - did immersion in water give rise to any distortion of the resin, or is HM20 sufficiently hydrophobic that this was not a concern? Mismatches in refractive indices (resin, water, glass, oil) could also presumably give rise to some small inaccuracies in depth prediction?

      Line 169 - was it possible to quantify the increase in signal? If the block is being hydrated, but the block is not absorbing water (re above point), then it must only be surface fluorophores that are hydrated and give rise to this increase in signal?

      Line 179 - presumably this is a result of the surface of the block being hydrated (re above points). This is mentioned later, but could be explicitly stated here to make the point more strongly.

      Line 188 - Peddie et al 2014 contains some limited data for mCherry in sections that could be worth mentioning in support of the findings of reduced photobleaching rates.

      Line 268 - It is not explicitly stated earlier, but multiple targets at similar depths would also be possible, presumably.

      Discussion

      Line 421 - sections cannot be repeatedly imaged without bleaching too much? Please elaborate on this statement to help strengthen the point as it isn't mentioned earlier in the results.

      Line 435 - FIB SEMs and 2Pi systems are not really so 'common' in the sense suggested, so this final statement should be reworded.

      M&Ms

      Line 540 - grooves, not groves

      Figure 3 legend

      Overall, it's a workflow comprising many methods, so it's best described as a schematic of the workflow. Confocal panel - target, not targets, and depth is misspelt.

      Figure 4 legend

      Line 834 - as far as I can see, this is a different organoid that isn't shown in a and b, so this text should be removed.

      Figure 5 legend

      Line 840 - was, not is

      Figure 6

      a) It would help with clarity to also put e.g. white arrows on the WT epithelium

      f,g) It isn't really clear on first viewing what these images show, so they would benefit from some labels.

      Minor stylistic comments

      There should be a space between numbers and units; this is inconsistent throughout. The use of black versus white text on the figures is inconsistent. Table 1 - is it in the supplementary material or not? If it is, it should be referenced as such in the text. The formatting could use some refinement to match the standard of the other figures. Capitalisation is inconsistent throughout.

      Significance

      The manuscript describes a workflow that connects several pre-existing methods to enable precision targeting of individual fluorescently tagged structures within a larger sample volume. The possibility for multi-modal imaging within a single embedded specimen facilitates correlation of data for structure with that of function. The work will be of interest to all scientists in the field of correlative microscopy.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In their manuscript, Ronchi P. et al. present a thorough and very well detailed workflow for 3D correlative-light and electron microscopy of whole cells in large tissues. Their approach of iterative block trimming and fluorescence imaging combined with laser branding allowed them to explore previously inaccessible tissues and questions. They imaged mammary gland organoids, and resolved the organization of the cells in the organoid and mitotic events. They also specifically targeted tracheal terminal cells of a 3rd instar Drosophila larva labeled with cytoplasmic DsRed to study their ultrastructure, and several Drosophila ovarian follicular cells (FC) where the cytoplasmic motor protein dynein was knocked down (KD) by RNAi. In the tracheal cells, they observed connected secretory vesicles, probably delivering extra-cellular matrix to the trachea tube. They also found that the overall shape of dynein KD FCs is distorted compared to WT, and that the localization of multi-vesicular bodies/endosomes inside the FCs changed from an apical to basal membrane localization. Although the approach is not entirely new, the manuscript certainly paves the way for future studies to obtain ultrastructural information from large specimens and combine it with meaningful fluorescence information. It is also beautiful and polished.

      Minor comments:

      1. The authors state (line 145) that they found the optimal concentration of UA and the best compromise between EM contrast and fluorescence preservation. However, no detail is provided as to how these parameters were experimentally determined.
      2. More detail as to how the block face was mounted and kept parallel to the glass bottom dish would be helpful. Also, what was the optical slice of the confocal and what was the increment in Z?
      3. Have the authors tried fluorophores with shorter wavelength (like GFP)? And if so, have they estimated the penetration depth in resin? This would be informative because many GFP lines already exist in the Drosophila model.
      4. In figure 6 i, how did the authors identify the structures to be MVBs close to the basal surface in the mutant, seeing as they do not look like the MVBs seen in WT cells? Similarly, how were the structures identified as endosomes in figure 5f? Can the authors quantify the total MVBs in the apical/basal membranes from both RNAi KD and WT?
      5. The motivation/question in the case of Drosophila samples was clear but not so obvious in the case of the mammary gland organoids. It would be nice if the authors could give a bit more information.
      6. In the introduction (line 124). The dimensions are given in microns and millimeters, which can be a bit confusing.
      7. In the discussion (lines 427-431), "sample preparation protocols compatible with fluorescence preservation have proven satisfactory for FIB-SEM milling and imaging" have also been shown by others (Porrati et al., 2019).
      8. Figure 1:
      • It would be helpful if the cell referred to in g was highlighted.
      • Is the cell in (h) the one in g or in a as written?
      • Is the image in (k) inverted compared to (i)?
      Figure 2:
      • In panel d it seems that some numbers on the x axis were duplicated.
      Figure 5:
      • How does the perfect overlap confirm the accuracy of targeting?
      • In panel (e) it was not particularly easy to understand what is the basal lamina.
      • In panel (g) the fused vesicle is not as clear as the movie. I also found it open to interpretation whether this is in fact a fused vesicle.

      Significance

      The increasing demand for volume electron microscopy brings a lot of challenges to correlative light and volume electron microscopy workflows. Although the methods used by the authors are not new, their combination is original. The manuscript will certainly contribute to the field of correlative light and volume EM and provide a rather detailed protocol that can be reproduced by others. The workflow is also more efficient than what was previously achieved using x-ray instead of light microscopy (Bushong et al., 2015; Karreman et al., 2016).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Major points:**

      • The affinity analyses need more work. This is against A/B/C isoforms, and also the dimerization affinity between the fluorescent proteins could change the apparent on/off rates. This point is not quantified or discussed. Due to the chemical equilibrium analysis, the apparent equilibrium is not only affected by these on/off rates, but also by the local availability (concentrations) of the reacting moieties. In the limit where the biosensor concentration is low within a cellular subcompartment or vice versa, how is this going to change the sensitivity of detection, given that this can push the reaction in either direction? Since equimolar distribution of the moieties is not guaranteed, this affects the detection characteristics of this biosensor. This point should be discussed and emphasized.

      Regarding the A/B/C isoforms: We did not mean to claim that the sensor is specific for RhoA. Based on the literature, we are certain it will also bind RhoB and RhoC. We observed binding to active RhoB in an experiment not shown in the manuscript. To make this clearer, we changed the name of the Rho GTPase to Rho.

      Regarding the dimerization affinity: Some initial data have been acquired for the weaker dimers Venus and iRFP. They seem to have a slightly beneficial effect, but less so than the stronger dimer dTomato. We agree that the biosensor concentration affects the performance (which is an important point with respect to optimizing the right concentration, as will be discussed later). We think that the local availability is not limiting because of fast diffusion of the soluble biosensor. However, this may be an issue in highly polarized cell types such as neurons. This is added to the discussion: ‘The biosensor concentration of relocation probes affects their performance. Although the diffusion of a soluble probe will not readily lead to differences in local availability in most cell types, this may be an issue in highly polarized cell types.’

      • Fig 1 A: Are the fluorescence changes of the biosensors due to stimulation with histamine completely reversible ? In other words, is it possible to see a total recovery of the signals with pyrilamine or in the presence of another antagonist ? If not, why?

      This is typically what we observe for this antagonist. Although it is added at a saturating concentration, it cannot completely switch off the Rho GTPase activity. This has also been observed with a DORA FRET sensor (Figure 4B in: https://doi.org/10.1124/mol.116.104505)

      Does histamine stimulation induce a maximal activation of RhoA in HeLa cells? What happens in terms of fluorescence changes when the activity of RhoA is inhibited or in the presence of a Gαq-inhibitor, and in conditions in which RhoA activating GEF, RhoA GAP or RhoA GDI is overexpressed? Generally, I think it is useful to have a calibration curve of the biosensor's activity, maximal/minimal (ON/OFF) response. For example, it would help to answer the question concerning the biosensor's binding affinity for RhoA ("The function of rhotekin is not clear, it seems to lock RhoA in the GTP bound state (Ito et al., 2018; Reid et al., 1996). We can only speculate that rhotekin has a stronger binding affinity for active RhoA than anillin and PKN1 have." (p.15))

      We have optimized our system to achieve high Rho activation and this has previously allowed us to do a quantitative comparison of the contrast of RhoA FRET sensors (see supplemental material of: https://doi.org/10.1038/srep14693). Whether this is a maximal response is unclear, but we do observe robust and consistently strong responses, which were not achieved by other strategies.

      What is the effect of histamine stimulation on a membrane marker expression/location ?

      We propose to perform an additional experiment, measuring the fluorescent intensity for a cytosolic fluorescent protein in the HeLa cell histamine stimulation assay, since we measure the depletion in fluorescent intensity of the sensor in the cytosol.

      What is the effect of histamine stimulation on dT2xrGBD biosensor response when this one is forced to be located in other subcellular compartments (mitochondria, nucleus) by fusing the construct to targeting sequences.

      We have not tried this experiment and we are not sure what the point of that experiment would be. If the construct were forced to localize, we would not observe relocalization.

      Physiological control: Effect of the presence of the biosensor in cell morphology/behavior... Experimental data concerning this point are evoked in the discussion section. "We demonstrate that low expression of the biosensor, through the truncated CMV promotor, did not inhibit cell division and cell edge retraction. Plus, endothelial cells expressing the sensor still show the typical reaction of contracting followed by spreading, when stimulated with thrombin. Low expression results in a low fluorescent signal of the sensor." (p.16) I think these results would deserve a section in this manuscript.

      This is the data shown in Figure 6, we will refer to it more clearly.

      Fig 2D : "The anillin sensor AHD+PH showed a 15% decrease in cytosolic intensity (Figure 2D), but it also relocalizes to striking punctuate structures upon histamine stimulation. These structures did not seem to represent local, high activity of RhoA, as the optimized rGBD sensor in the same cell showed no such locally clustered RhoA activation, but rather a homogenous activation at the membrane and a 60% drop in cytosolic intensity. Similar punctuate structures were observed in endothelial cells, when stimulated with the strong RhoA activator thrombin (Supplemental Movie 5)." And p. 15 : "However, we noticed that the AHD+PH sensor, containing aGBD, C2 and PH domain, localizes in a punctate manner. These 'dots' were observed in both HeLa cells and endothelial cells and were only observed with the AHD+PH RhoA sensor. As aGBD does not localize in puncta, it seems that the localization is caused by domains other than of the RhoA binding domain, i.e. the C2- and/or PH-domain." Punctate structures are also present in HeLa cells expressing the anillin sensor before histamine stimulation (see Supplemental Movie 4). Moreover, the punctate pattern activated by thrombin in endothelial cells looks different (more widespread) than the one activated by histamine in HeLa cells. In addition, these structures can also be found in human endothelial cells expressing dT2xrGBD (fig. 6B, Supplemental movie 10). What are those thrombin-activated structures in endothelial cells that would be similar to the ones in HeLa cells activated by histamine and that "did not seem to represent local, high activity of RhoA"? This is not commented on further by the authors.

      Very well spotted. What can be seen in Figure 6B and SMovie 10 are different vesicles that are always observed in endothelial cells expressing fluorescent proteins. We think they are endosomes/lysosomes, which would explain why especially the more pH stable red fluorescent proteins are visible in these structures. They do not localize at the membrane but in the cytosol. These structures are not induced by RhoA activation, and are not present in the TIRF data, which excludes the cytosol.

      • Fig 3A: "The rGBD sensors solely colocalized in the nucleus with RhoA but not with Rac1 and Cdc42, indicating that rGBD specifically binds constitutively active RhoA." What about dT2xrGBD binding specificity for the three homologues RhoA, RhoB and RhoC? This point is evoked in the discussion part (p.16) but there is no experimental data to support it "The specificity of the relocation sensor is determined by the binding specificity of the GBD. The rGBD binds the three homologues RhoA, B and C but not to Rac1 and Cdc42". So, why is rGBD presented as a RhoA biosensor?

      We apologize for this misunderstanding. We have no reason to assume that the biosensor does not bind all three isoforms. We will refer to the RhoA/B/C isoforms as ‘Rho’ and we will call it a Rho sensor.

      Fig 3B: The data scatter for the dTomato-2xrGBD is very wide compared to the mScarlet-1xrGBD. What is causing this wide data scatter and such heterogeneous response? This is a problem if the sensor is really so heterogeneously responding to a strong mutant of RhoA, is this a dimerization-dependent problem?

      We think that this is related to expression levels. Since dTomato-2xrGBD shows higher amplitudes, the spread also becomes larger, and so we think the coefficient of variation will be similar. We will add standard deviations and indicate fluorescent intensity.
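
      The scaling argument in this reply can be illustrated with a minimal sketch. The response values below are hypothetical and are not data from the manuscript; the sketch only shows that multiplying every response by a constant factor widens the absolute spread (standard deviation) while leaving the coefficient of variation unchanged.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical relative changes in cytosolic intensity (illustrative only).
low_amplitude = [0.10, 0.15, 0.20, 0.25, 0.30]
# Assumption: a second sensor responds three times as strongly in every cell.
high_amplitude = [3 * v for v in low_amplitude]

sd_low = statistics.stdev(low_amplitude)
sd_high = statistics.stdev(high_amplitude)
cv_low = coefficient_of_variation(low_amplitude)
cv_high = coefficient_of_variation(high_amplitude)

print(f"SD: {sd_low:.3f} vs {sd_high:.3f}")
print(f"CV: {cv_low:.3f} vs {cv_high:.3f}")
```

      Whether the real data behave this way depends on the measured distributions, which is why reporting standard deviations alongside intensity is informative.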

      These domain-based biosensors could cause dominant negative/inhibitory artefacts. Also the dimerizing fluorescent proteins could introduce oligomerization of the signaling complex which is not real in cells and clearly affect phenotype. These issues should be tested and addressed by a quantitative measure of cell behavior against increasing concentration/changing dimerization potentials of the biosensor in live cell assays.

      We agree that these types of biosensors can, in a general sense, cause dominant negative/inhibitory artefacts and we explicitly mention this in the text: “Visualizing the endogenous Rho activity may interfere with the biological role of Rho, as the sensor binds endogenous Rho and may compete with natural effectors of Rho”

      We were worried about this possible downside and have been looking very carefully at the effects of the biosensor. As highlighted in the manuscript, we noticed mitosis and natural contraction/spreading of endothelial cells. We were able to make stable cell lines. These are all signs that there are no strong negative effects. We also advise using low expression of the sensor to limit negative effects: “To limit the perturbation, the sensor should be expressed at a low level to allow Rho signaling”

      Fig 4 C: "Given the successful improvement of the rGBD-based biosensor by increasing the number of binding domains, we explored whether the same strategy can be applied to the G protein binding domains from PKN1 and Anillin" and "The dimericTomato-2xrGBD sensor shows the best relocation efficiency, with a median change in cytosolic intensity of close to 50%"... So why has the dT-2xaGBD construct not been tried?

      Because we did not see the stepwise improvement as we saw for the rGBD sensor, so we do not expect an improvement in that construct. Plus, the cloning for the 2xaGBD was initially not working out.

      p.9 : "None of the pGBD sensors showed a clear membrane localization upon stimulation with histamine (Figure 4A). The increase in cytosolic intensity observed in some cells, seems to be caused by changes in cell shape." Are changes in HeLa cell shape induced by histamine stimulation? How can this be explained? Do some cells expressing the rGBD sensors (single, tandem and triple and dimericTomato) undergo these changes of shape too, upon histamine stimulation? If yes, to what extent do these changes in cell shape affect signals?

      The activation of Rho GTPases by the histamine receptor often results in changes in cell shape in HeLa cells. We propose to perform an additional experiment with a cytosolic fluorescent protein in the HeLa cell histamine stimulation assay, to measure potential intensity changes caused solely by shape changes.

      p9: Overall, the paragraph about Fig 4 E,F is not clear. What do the amino acid sequences of the G protein binding domains of Anillin and PKN1 contribute to the understanding of the rGBD, aGBD and pGBD sensors?

      Since there is no crystal structure available for rGBD, we thought it would be interesting to compare the amino acid sequences to see how similar/different these domains are.

      p. 12, Fig 6C, Fig. 6E: "The membrane marker showed a relatively small increase in intensity after stimulation and the curve did not show the same pattern as the RhoA biosensor intensity curve. Therefore, we conclude that the increase in RhoA biosensor intensity is caused by relocalization." It surprises me that a decrease in cell area induced only a very small increase in fluorescence intensity of the membrane marker. It would be very helpful to see a figure with a quantification of the membrane marker intensity changes during this process. What about a cytoplasmic marker?

      Figure 6D shows the intensity measurements of the membrane marker. The small change can be caused by membrane changes, but also by other factors that affect intensity (focus change). We will add the membrane intensity measurements to Figure 6F and G as well. Since these measurements are made in TIRF, the intensity of the cytoplasmic marker would be very low. Therefore, we decided to use a membrane marker.

      In addition, how is the movement artefact corrected?

      The ROIs were drawn by hand to measure the fluorescence intensity.

      "Our data revealed that the RhoA biosensor displays RhoA activity at subcellular locations where RhoA activity is expected, and appears mostly independent of fluorescent intensity measured by a separate membrane marker." This part should be developed further. Are there examples of cells for which the biosensor activity is dependent on fluorescent intensity measured by a separate membrane marker?

      The intensity of the membrane marker is only affected by changes in membrane area or morphology (and other technical reasons that lead to a change in intensity, e.g. focal drift, bleaching). This point is made in the paper by Dewitt that we cite (https://doi.org/10.1083/jcb.200806047). We are not aware of papers that show biosensor activity dependent on a separate membrane marker. One potential confounding issue is quenching of the membrane marker by FRET, but this would lead to a decrease in intensity and we do not observe that.

      Discussion (p.16): "Comparing relocation sensors to FRET sensors, both have their own advantages and disadvantages." The dT2xrGBD sensor is here presented as a new relocation sensor for RhoA activity. However in general, there should be more development of the direct comparisons, pros and cons, with quantitative data or more details allowing to have a general overview of the advantages and disadvantages of this new relocation biosensor as compared to the existing ones.

      We explain the pros and cons of FRET sensors and relocation sensors in the introduction and we show a quantitative comparison of this new relocation biosensor as compared to existing relocation biosensors (figure 2). The advantage of the relocation sensor relative to a FRET sensor is highlighted in the discussion: “Furthermore, the relocation sensor requires confocal microscopy or TIRF microcopy to spatially separate the bound from unbound probe, whereas FRET measurements are usually performed with widefield microscopes. However, the former mentioned techniques usually offer the higher resolution. Here we presented previously unachieved visualization of Rho activity at subcellular resolution. We observed local activation of Rho at the Golgi which was not possible with the DORA RhoA FRET sensor (Van Unen et al., 2015), indicating a higher sensitivity of the relocation sensor.”

      Minor points:

      • Overall, scale bars should be included in the HeLa cell microscopy images.

      We will provide the width of the image in the figure captions.

      It was not clear until the Methods section that the widefield analysis appeared to be normalized against another fluorescent protein-based cytoplasmic signal to correct for variations in cell volume. I think this point should be mentioned in the main text more prominently and emphasized so that readers are not misled.

      The normalization of time traces has been done to account for differences in the initial intensity (e.g. due to differences in expression level), this is now better explained: “The mean gray value or cell area respectively, were normalized by dividing each value by the value of the first frame, to account for differences in the initial intensity.” Of note, there is no extra cytoplasmic signal to correct for variations in cell volume.
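
      The normalization described in this reply can be sketched as follows. The function name and the example intensity values are hypothetical and chosen for illustration; the sketch only shows how dividing each trace by its first frame removes differences in initial intensity so that relative changes can be compared.

```python
def normalize_to_first_frame(trace):
    """Divide every value in a time trace by the value of its first frame."""
    first = trace[0]
    if first == 0:
        raise ValueError("first frame is zero; cannot normalize")
    return [value / first for value in trace]

# Hypothetical mean gray values for two cells with different expression levels.
dim_cell = [200.0, 180.0, 120.0, 110.0]
bright_cell = [800.0, 720.0, 480.0, 440.0]

dim_norm = normalize_to_first_frame(dim_cell)
bright_norm = normalize_to_first_frame(bright_cell)

# Both traces now start at 1.0 and can be compared as relative changes.
print(dim_norm)
print(bright_norm)
```

      Note that this step only rescales each trace to its own starting value; as stated in the reply, it does not correct for variations in cell volume.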

      • p. 9 : "Anillin AH+PH sensor" instead of "Anillin AHD+PH sensor"

      Corrected.

      • Fig 2B and 2D : Explain what parameter is used for the normalization of each signal?

      We state in the methods: “The mean gray value or cell area respectively, were normalized by dividing each value by the value of the first frame, to account for differences in the initial intensity.”

      • Fig. 1A, top panel: it would be good to know which images correspond to the addition of histamine and which ones correspond to the addition of pyrilamine

      The time line with the grey bars indicating the stimulus of the graph matches the images. We changed the legend to clarify: “The images match with the perturbation that is indicated for the plot in panel C.”

      • "TRIF microscopy" is written in legends of Fig. 6 and of Supplemental movie 11, and in Materials and Methods section p. 23
      • Fig. 3 legend: Correct "mScralet-I-1xrGBD"
      • Fig 4F, legend: " Anillin and the bound RhoA are depicted in dark and light yellow, respectively. PKN1 and the bound RhoA are depicted in light and dark blue, respectively." The color codes in the legend are the opposite of those in the figure.
      • p.11 : "To examine this, we used a rapamycin-induced hetero dimerization system to recruit the dbl homology (DH) domain, of the RhoA activating GEF p63, to the membrane of the Golgi apparatus." Corresponding references should be included.

      Thanks for pointing these out, all have been addressed/corrected.

      Fig. 5A : Explain FRB, Fig 5C : no unit for a ratio

      We changed the legend “A) Still images of HeLa cells expressing FRB (part of rapamycin hetero-dimerization system) anchored to the membrane, Golgi and mitochondria (first column), FKBP-p63-DH (counterpart of rapamycin hetero-dimerization system, not shown), localization of the dimericTomato-2xrGBD sensor pre activation (second column) and post activation with 100 nM rapamycin (third column).”

      Reviewer #1 (Significance (Required)):

      Mahlandt et al. optimized and compared several G protein binding domain (GBD)-based biosensors in order to improve the potential of existing RhoA-domain-based biosensors for visualizing and reporting RhoA subcellular activity in living cells and tissue. The authors demonstrate that fusing a dimerizing fluorescent protein to the rhotekin GBD (rGBD) is an efficient strategy to increase the brightness of the sensor. The use of Rhotekin-RBD as affinity domain for Rho-class of GTPase is very well established, both in the methods of affinity pulldowns and in biosensor designs for Rho-class of GTPases in the field. The authors show that the dimericTomato-2xrGBD biosensor can indicate endogenous RhoGTPase spatial activity in dividing HeLa cells and during cell retraction of human endothelial cells.

      The dimericTomato-2xrGBD biosensor is thus introduced and described as a RhoA localization-based biosensor, however no experimental data demonstrate the binding specificity of the biosensor for RhoA. Moreover, authors discuss about a previous work showing that rGBD binds the three paralogs RhoA, RhoB and RhoC. This point and the apparent singular claim of this biosensor reporting RhoA activity as this manuscript alludes to are inappropriate and misleading.

      We apologize for the misconception that this probe is specific for RhoA. We do not want to claim this sensor is specific for RhoA (and note that we have been involved in generating FRET biosensors for the different isoforms, RhoA/B/C ourselves: https://doi.org/10.1038/srep25502). We have addressed this in the introduction, and we have changed RhoA to Rho to better reflect that we are looking at all three isoforms.

      This point especially in light of the field has moved on in the past 20 years to assign more specificity (not less) to which GTPase the biosensors are being specific, i.e., via FRET, etc., significantly tempers the enthusiasm of this reviewer. In addition to this main issue, the incomplete characterization of the relative affinities of the domain to the target GTPase isoforms and of the dimerization affinities of the fluorescent proteins (which could change the apparent reaction rate constants), and the impact of which on the reversibility, oligomerization states and detection sensitivity, and the biology, also appeared lacking. Additional stoichiometric considerations and apparent reaction equilibrium that are impacted by the relative concentrations of interacting moieties require careful and further analyses, study and discussion. In general, I think that this work could be interesting to a more specialized field audience with further analyses of the affinities of the interacting moieties and better characterization of the behavior of this biosensor in living cells since it is likely causing oligomerization of the signaling units due to the forced dimerization of the detection unit.

      **Referees cross-commenting**

      This is a dimerizing probe. It gets pretty bulky. Is dimerization occurring prior to GTPase binding or after? Is the dimerized probe/GTPase complex somehow more stable than would otherwise be if they were monomeric? If so, how would that affect the lifetime of the detection and also the diffusivity of the probe("s", if already dimerized) and possibly the whole oligomer?

      dTomato has been shown to be a strong, obligate dimer. Therefore, we assume that the fluorescent probe is present as a dimer before (and after) binding to the GTPase. With respect to size/bulkiness we’d like to note that the biosensor is only somewhat larger than a FRET sensor, i.e. 2x47 kDa and 74 kDa, respectively.

      It still feels to me that, yes new brighter fluorescent proteins were used, and dimerization and multimerization of the signaling complex increased the SNR of the system, but the whole premise just reverted the biosensor field back 20yrs, which has been my biggest single concern regarding this paper.

      This evaluation is in our opinion largely based on the misconception that we claim RhoA specificity. We do not claim that this sensor is specific for RhoA (and we have revised the manuscript accordingly) and we are not aiming to replace FRET sensors (being quite fond of FRET sensors as is clear from our previous work). We think that there are ample opportunities and applications for the improved relocation sensor (as is also evident from requests for the plasmids that encode the probe), for instance in experiments where FRET sensors are challenging to use, such as optogenetics experiments and multiplexing biosensors. We state in the discussion: “Single color relocation sensors are ideal candidates for multiplexing experiments. Plus, the growing field of optogenetics is in need of single color biosensors to detect the effect of optogenetic perturbations. The conventional CFP-YFP FRET sensor is incompatible with most blue light induced optogenetic tools.”

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Visualization of subcellular activity of GTPases is critical for the understanding of signal transduction of cell growth, differentiation, morphogenesis, etc. For this purpose, researchers often use relocation probes, which comprise a fluorescent protein(s) and a GTPase-binding domain(s), and move from cytosol to the location of active GTPases. The authors improved a previously reported RhoA probe with a strategy of increasing the avidity of RhoA-binding domain and optimizing the fluorescent protein. In the beginning, the authors declare "the relocation of the original, single rGBD monomeric fluorescent protein sensor is hardly detectable" in HeLa cells. To overcome this problem, they developed six constructs by changing the number of rGBD (rhotekin GBD) domains and fluorescent proteins. They found that the increase in the number of rGBD and a dimeric prone fluorescent protein, tdTomato, generate a better probe for RhoA activity. The specificity was examined by using active Rac1 and Cdc42 proteins. Different RhoA-bind domains derived from Rhotekin, PKN1, and Anillin were compared to show the superiority of rhotekin GBD. Finally, they show that subcellular RhoA activation detected by the probe is consistent with the knowledge on RhoA activation by using vascular endothelial cells. Overall this work has been well done in an organized way and disclose a novel RhoA probe that will be useful in future research of RhoA.

      **Major comments:**

      Reproducibility: The number of analyzed cells is described in the legend, but the number of independent experiments is not shown. This is critical to evaluate the reproducibility of the data. Preferably, the data should be presented to show data set derived from each trial clearly. It should also be described how cells were selected for the analysis? It is also preferable to apply automatic analysis. Ideally, the raw data with code sets for analysis should be presented.

      We will indicate the independent experiments. ROIs were partly drawn by hand. We agree that segmentation based methods would increase reproducibility, but this data set is not suitable for automated analysis.

      1. A serious defect of the relocation probe is the dependency on the expression level. The lower the number of the probe in a cell, the higher the fraction of recruited to active RhoA. However, lowering the probe concentration will be accompanied by dim fluorescence. The authors should describe how the optimal expression level was achieved.

      We fully agree. Using the low expression promoter improved the dynamic range but we have not gained control over the optimal expression level. It does vary from cell to cell. We added this paragraph to the discussion: “However, the optimal expression level is crucial for the dynamic range of the relocation sensor. Low concentrations of the sensor will show higher levels of relocalization, as a larger fraction of the sensor molecules binds the limited, active, endogenous Rho molecules. Nevertheless, if the concentration of sensor is too low, the fluorescent signal cannot be detected. To optimize the expression level, the CMVdel promoter, leading to a lower expression level, was applied (Watanabe and Mitchison 2002). Even though this minimal promoter improved the performance of the relocation sensor, a variety of expression levels was observed. Cell sorting could be applied to select for cells with the optimal expression level.”
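
      The dependence on expression level discussed here can be illustrated with a simple equilibrium model. The sketch below assumes a 1:1 binding reaction between sensor and active Rho with a single dissociation constant, which is a deliberate simplification: the dimeric probe carries several GBDs, and none of the numbers are measured values. All names and parameters are hypothetical.

```python
import math


def bound_complex(sensor_total, rho_total, kd):
    """Equilibrium concentration of sensor-Rho complex for 1:1 binding.

    Solves C^2 - (S + R + Kd) * C + S * R = 0 for the physically valid root.
    All three arguments must use the same concentration unit.
    """
    total = sensor_total + rho_total + kd
    return (total - math.sqrt(total * total - 4.0 * sensor_total * rho_total)) / 2.0


def fraction_sensor_bound(sensor_total, rho_total, kd):
    """Fraction of all sensor molecules bound to active Rho."""
    return bound_complex(sensor_total, rho_total, kd) / sensor_total


def fraction_rho_occupied(sensor_total, rho_total, kd):
    """Fraction of active Rho molecules occupied by sensor."""
    return bound_complex(sensor_total, rho_total, kd) / rho_total


# Hypothetical values in arbitrary concentration units.
RHO_TOTAL = 1.0
KD = 1.0
sensor_levels = [0.1, 1.0, 10.0]

sensor_fractions = [fraction_sensor_bound(s, RHO_TOTAL, KD) for s in sensor_levels]
rho_fractions = [fraction_rho_occupied(s, RHO_TOTAL, KD) for s in sensor_levels]
```

      Under these assumptions the bound fraction of the sensor falls as its total concentration rises, while the occupied fraction of active Rho increases, matching the qualitative argument that high expression both weakens relocation and sequesters more of the endogenous Rho.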

      1. Statistical analysis is absent throughout the paper.

      We will add standard deviations to the dot plots.

      **Minor comments:**

      In Figure 1, mNeonGreen (mNG) was used as the fluorescent protein fused to rGBD instead of EGFP, which was used in the original paper. For a fair comparison with the previous report, analysis using the original probe, i.e., EGFP-rGBD, is desirable. Or, the author may simply tone down.

      That is a good point. We propose to perform the HeLa cell histamine stimulation assay for the eGFP-rGBD sensor and add the data to Figure 1B.

      1. In the introduction, it says " The RhoA FRET sensors achieve subcellular resolution to a certain extent, but due to their design they do not localize as endogenous RhoA". Reference is required.

      We changed the following in the introduction: The RhoA FRET sensors achieve subcellular resolution to a certain extent, but due to their design they may not localize as endogenous RhoA (Michaelson et al., 2001).

      1. rGBD should be rhotekin GBD. It should be clearly stated in the beginning.

      We wrote in the introduction: “Secondly, the rhotekin G protein binding domain (rGBD)-based eGFP-rGBD Rho sensor, that was reported in 2005 (Benink & Bement, 2005).” and in the results “The eGFP-rGBD biosensor consists of an enhanced green fluorescent protein (eGFP) and a rhotekin G protein binding domain (rGBD).”

      1. The reason why the CMVdel promoter is used should be stated clearly.

      Thanks for the suggestion. We added to the discussion: “However, the optimal expression level is crucial for the dynamic range of the relocation sensor. Low concentrations of the sensor will show higher levels of relocalization, as a larger fraction of the sensor molecules binds the limited, active, endogenous Rho molecules. Nevertheless, if the concentration of sensor is too low, the fluorescent signal cannot be detected. To optimize the expression level, the CMVdel promoter, leading to a lower expression level, was applied (Watanabe and Mitchison 2002). Even though this minimal promoter improved the performance of the relocation sensor, a variety of expression levels was observed. Cell sorting could be applied to select for cells with the optimal expression level.”

      1. Page 23: TRIF should read as TIRF.

      Corrected

      1. Figures: Grey letters should be avoided.

      We will verify the figures for readability

      1. Fig. 3A: Apparently the probe binds to Rac1 G12V to some extent. The discrepancy of RhoA localization between mSca-1xrGBD and dt-2xrGBD must be discussed. This observation clearly suggests that GBD may change the localization of RhoA. It is interesting to note that Rac1 and RhoA may localize to the nucleolus.

      We have changed the text to make clear that the dTomato-2xrGBD binds better to RhoA than the 1xrGBD variant: “Comparing the original single rGBD sensor (mScarlet-I-1xrGBD) with the dimericTomato-2xrGBD sensor, a higher nuclear to cytosolic intensity ratio for the multi-domain sensor was detected, supporting its higher affinity for RhoA.”

      Reviewer #2 (Significance (Required)):

      1. This work discloses an improved RhoA probe, which will be welcome by the researchers in the field of small GTPases.

      We are glad that the reviewer shares our enthusiasm

      1. Novelty of increased GBD: The idea of increasing the GTPase-binding domain in the relocation probe was reported some time ago: Augsten et al., Live-cell imaging of endogenous Ras-GTP illustrates predominant Ras activation at the plasma membrane. EMBO Rep. 7, 46-51 (2006).

      Agreed - we added the reference to the discussion: “This strategy, to utilize multiple repeating domains has also been effective for a PH domain based lipid sensor and a cRAF derived Ras-binding domain Ras activity sensor (Augsten et al., 2006; Goulden et al., 2018)”

      1. Novelty of rhotekin GBD: The reason why GBD of PKN is chosen in intramolecular FRET biosensors such as DORA and Raichu is that the affinity of other GBD's is too high [Table 1, Yoshizaki et al., J. Cell Biol. 162, 223-232 (2003)]. Judging from this old data, GBD's of mDia and Rhophilin, may work better than that of Rhotekin. Moreover, it is known that PH domain may be required for proper conformation of GBD's. Thus, it is not surprising that removal of PH domain from the Anillin probe abolishes its translocation ability. Therefore, to the reviewer's eyes, the choice of GBD in Figure 4 is biased to those that will work less efficiently.

      We see the point, but we have chosen these (PKN/anillin) for a practical reason, namely that we had cDNA encoding these probes in our lab. We thank the reviewer for the suggestion to look into other GBDs.

      1. Authors' proposal of "systematic optimization" sounds exaggerated, considering the small number of constructs tested in Fig. 1 and Fig. 4. Similarly, it is not clear whether dimerize prone-fluorescent proteins are better choice by simply comparing tdTomato and mNeonGreen.

      Fair enough, we think of it as a systematic comparison (figure 1) and we have rephrased the sentence: “Improving the rGBD probe by increasing the avidity was successful”

      1. Keywords of expertise: Fluorescent probes. Cell signaling.

      **Referees cross-commenting**

      Because Review Commons does not specify the journal to be published, the request by the Reviewer #1 sounds too much. The probe reported in this work deserves publishing, although it may not be a ground-breaking probe.

      We thank the reviewer for the encouraging words and support.

      Reading the comments by the other reviewers, following concerns should be cleared.

      1.Relationship between the probe's concentration and the response.

      2.Specificity to RhoA, RhoB, and RhoC

      3.The effect of the cell morphology as pointed by Reviewer #1.

      Concern 1 will be addressed by re-analysis of the data. Concern 2 is addressed by changes in the text, as we have indicated in our response. Concern 3 will be addressed by control experiments that look into changes in cell morphology.

      To Reviewer #1

      -Since equimolar distribution of the moieties are not guaranteed, this affects the detection characteristics of this biosensor. This point should be discussed and emphasized. The probe will diffuse rapidly within cytosol. Therefore, subcellular concentration of the probe may not affect significantly on the performance of the probe.

      -What is the effect of histamine stimulation on dT2xrGBD biosensor response when this one is forced to be located in other subcellular compartments (mitochondria, nucleus) by fusing the construct to targeting sequences. I did not understand this question quite well.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      In this paper, Mahlandt et al compared and improved relocation sensors to visualize the activity of endogenous Rho. As a result of screening for several Rho binding domains (GBDs) and the number of repeats, the authors found that dTomato-2xrGBD is optimal, and succeeded in visualizing the activity of Rho during cytokinesis and migrating cells. Overall, this sensor would be a useful tool for many cell biologists. The data are represented clearly in the figures. I provide some concerns; that would be worth addressing in a revised version.

      **Major comments**

      1. The authors should experimentally show the quantitative relationship between biosensor expression level and degree of relocation. In principle, this relocation type sensor binds to the endogenous GTP-bound Rho. Since the number of endogenous GTP-bound Rho is limited in cells, the degree of relocation is considered to be dependent on the expression level of the sensor. If the number of biosensors expressed is too small in a cell, the response will be saturated. If the number of biosensors is too large, the relocation will be weakened and the Rho signal will be suppressed. Furthermore, although a weak promoter is used, the heterogeneity of the expression level in each cell makes quantitative analysis difficult, especially in transient expression experiments. I would like to suggest the addition of quantitative experimental data.

      We propose to re-analyze our data, indicating the relative expression levels of the biosensor (based on intensity) in the dot plots. We agree that the expression level potentially affects sensor performance and we will address this more clearly in the text. We added to the introduction: “A potential drawback is the background signal of the unbound biosensor in the cytosol, which may occlude the bound pool and reduce the dynamic range.” We added to the discussion: “However, the optimal expression level is crucial for the dynamic range of the relocation sensor. Low concentrations of the sensor will show higher levels of relocalization, as a larger fraction of the sensor molecules binds the limited, active, endogenous Rho molecules. Nevertheless, if the concentration of sensor is too low, the fluorescent signal cannot be detected. To optimize the expression level, the CMVdel promoter, leading to a lower expression level, was applied (Watanabe and Mitchison 2002). Even though this minimal promoter improved the performance of the relocation sensor, a variety of expression levels was observed. Cell sorting could be applied to select for cells with the optimal expression level.”

      1. Most of the time-series data show only a representative example, namely, N = 1. In relation to the aforementioned issue, data and distribution derived from several cells (e.g. SD) should be shown in a clear manner.

      We focused not primarily on the kinetics, but more on maximal relocation, therefore we do not have time lapse movies for all the shown data points (e.g. a time lapse is shown in 1C and the data for a higher number of cells is shown in 1B). However, we can provide time series for multiple cells from our existing data sets.

      **Minor comments**

      1. I hesitate to call the biosensor developed in this study "RhoA sensor". This is because, as the authors mention, it has been reported that the rGBD also binds to RhoB and RhoC. If the authors call it a RhoA sensor, they should investigate the specificity of binding to RhoB and RhoC in addition to RhoA. If not, I would like to suggest changing the name to "Rho sensor" instead of "RhoA sensor".

      This is a fair point, also made by other reviewers. We will change the name to Rho sensor.

      Reviewer #3 (Significance (Required)):

      Rho is one of the low molecular weight G proteins, which regulate the reorganization of the actin cytoskeleton. As biosensors for visualizing the activity of Rho proteins, it has been reported intramolecular and intermolecular FRET biosensors and relocation sensors. The latter is less widely used than the former, because of insufficient sensitivity and specificity. Therefore, the improvement of Rho biosensors is really important and needed in the community of cell biology research field. The importance of this manuscript, I believe, is that the authors compared the existing relocation type Rho sensors. This is informative.

      Rho is one of the low molecular weight G proteins that regulate the rearrangement of the actin cytoskeleton. Intramolecular and intermolecular FRET biosensors and relocation sensors have been reported as biosensors for visualizing the activity of Rho proteins. The latter is not as widely used as the former due to its inadequate sensitivity and specificity. Therefore, improving the Rho biosensor is very important and is needed by the community in the field of cell biology research. I believe the importance of this manuscript is that the author compared existing relocation-type Rho sensors. This is beneficial and informative.

      My expertise: Cell biology, live-cell imaging, development of genetically encoded fluorescent probes

      We thank the reviewer for the positive evaluation of our work.

      **Referees cross-commenting**

      I generally agree with Reviewer 2's opinion. The opinions of our three reviewers can be summarized in three points: expression level, specificity, and statistical analysis and representation. I think these should be asked to the authors as major critics that should be addressed before publication.

      We agree and we propose to address the three main points (see also response to reviewer 2).

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      **SUMMARY:**

      Mahlandt and colleagues use advanced microscopy techniques to test new configurations of several Rho relocation sensors, which report on the activity of members of the endogenous RhoA GTPase family of proteins. A novel variant containing the dimericTomato fluorescent protein and a double rGBD domain shows a substantial increase in dynamic range in comparison with 2 originally published sensors and other new variants they tested. They use a cellular assay to show that this novel variant is specific for the activity of Rho family of Rho GTPases and not the Cdc42/Rac families. Finally, the authors show that this new variant can be used to measure a specific localised increase of Rho activity at the Golgi, and during cell division and cellular morphology changes that are known to activate the RhoA family of Rho GTPases. The biosensor can be useful for the community. However, I think the paper is not well written (I was very confused by several statements). The manuscript should be thoroughly proofread, there are quite some unclear or duplicate passages (for examples, see "text comments" below). Currently this hampers the interpretation of the manuscript for the reader. The authors are very dogmatic - they make claims about the literature that I do not agree with at all. Some of these unbalanced views will confuse the non-expert readers.

      **MAJOR COMMENTS:**

      -The reported dTomato-rGBD sensor is unable to distinguish between the different members of the RhoA family of Rho GTPases (measures combined activity of RhoA, RhoB and RhoC), which is unclear for the reader in the current text phrasing in the introduction. The authors seemingly suggest throughout the manuscript to work with a specific RhoA biosensor, which is not the case. This strong statement is completely misleading. The authors need to refer to the biosensor being specific for Rho (RhoA,B,C) GTPases versus Rac1/Cdc42 biosensors, and discuss what this means for the field. Some discussions about this are made in a JCB paper by Graessl et al, that the authors also cite.

      We agree that the probe measures the combined activity of all three isoforms and apologize for the confusion. We have changed the name to Rho sensor and updated the manuscript.

      -If the authors really want to sell that the biosensor is only specific for RhoA, then they need to make a series of experiments with RhoB and RhoC dominant positive/negative constructs, to tackle that specific point.

      No, we did not intend to claim the sensor is specific for RhoA in comparison to Rho B and C.

      -Did the authors consider to use the artificial GBD from Keller, 2019 to make a specific relocation sensor for RhoA? Perhaps the authors can comment on the feasibility of this approach?

      We think that this might be the only way to make a specific RhoA relocation sensor. Recently, we have received the DNA and plan to do the histamine stimulation experiment in HeLa cells as in Figure 1B.

      -A strong (dogmatic) statement is that Rho GTPases FRET sensors report solely on the activity of GEFs. This is not the case, these sensors report on the flux of GAP and GEF activity for Rho GTPase in cells. This is also true for relocation sensors, and has been documented in work from the Bement/Pertz/Nalbant/Dehmelt labs.

      We thank the referee for this correction and we have changed the text to: “By design, these FRET sensors report on the balance between activating guanine exchange factors (GEFs) and inactivating GTPase-activating proteins, instead of visualizing endogenous RhoA-GTP”

      -From the data in Figure 1, it seems to follow that the efficiency of PM relocation is mainly determined by the number of rGBD modules on the sensors. Could the authors speculate on how this works in practice; is the multi-rGBD sensor increasingly kinetically trapped by a single RhoA molecule, or is the sensor mostly bound to multiple RhoA molecules at the PM?

      This is an interesting question to which we do not have an answer. We added some text to the discussion: “It is currently not clear how each of the GBDs of the dimericTomato-2xrGBD sensor contribute to Rho binding and the probe may bind anywhere between 1 and 4 Rho molecules. If the probe is capable of binding multiple Rho proteins, the binding efficiency will depend on the local density of Rho in the membrane.”

      -Some form of statistical analysis should be performed on the data to give the reader a sense of robustness of the findings and its uncertainty. Either a non-parametric test on the median, confidence intervals or e.g. boxplots showing notches.

      We will include standard deviations in our dot plots.
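
      As an illustration of the reviewer's suggestion of a confidence interval on the median, the following is a minimal sketch of a percentile bootstrap. It assumes the per-cell relocation values are available as a plain list; the function name, the number of resamples and the example values are hypothetical and do not come from the manuscript.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the median of `values`."""
    if len(values) < 2:
        raise ValueError("need at least two values to bootstrap")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lower_index = max(0, round(alpha * n_resamples))
    upper_index = min(n_resamples - 1, round((1.0 - alpha) * n_resamples) - 1)
    return medians[lower_index], medians[upper_index]


# Hypothetical per-cell changes in cytosolic intensity (percent).
relocation_values = [12.0, 15.5, 9.8, 20.1, 17.3, 14.2, 11.0, 18.7, 16.4]
ci_low, ci_high = bootstrap_median_ci(relocation_values)
sample_median = statistics.median(relocation_values)
```

      A percentile bootstrap makes no assumption about the shape of the distribution, which suits dot plots with a modest number of cells. Other interval constructions would be equally defensible.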

      -Time-series now show single example traces (fig1C, fig2B,D, fig5B). It would be informative for the reader if the curves of all experiments were plotted, and statistical analysis would be performed on the data. It is unclear how representable the kinetics in these curves are.

      We can show the kinetics for more examples, but we did not acquire time lapses for all the data points shown in the dot plots, since the microscope could not move fast enough to acquire frames with an interval of 10-20 s.

      -About the spatial patterns of Rho activity (cytokinesis, tail retraction, ...), the reviewers agree that statistical analysis is much more difficult. But maybe showing 2-3 cells instead of only one, would make the data more convincing.

      We will provide more examples.

      **MINOR COMMENTS:**

      -(fig4a) dTomato-2xpGBD, why is this not good? how is it possible that it binds good to nucleus, but no translocation is observed? const activity? expression levels?

      We were surprised and somewhat disappointed by this as well and we do not have an explanation, besides that the binding affinity required for dynamic relocation seems to be higher than the one for binding the overexpressed active Rho GTPase.

      -(fig4f) The aGBD/pGBD binding sites for RhoA show great overlap but bind to completely different sites at RhoA, is this correct? (color scheme used for the structures is not easily interpretable)

      It is correct that they both have two binding sites, but apparently crystal structures were obtained for one or the other. Maesaki et al. 1999 describes the two binding sites. We will change the colors.

      -(fig5) Unclear how the intensity at the specific organelles is measured? were the organelles segmented or hand-drawn ROI based? The quantified difference is very small, no statistics are performed, and it is unclear how it was measured. This is currently weak evidence for the main claim in this subsection.

      ROIs are drawn by hand. We will provide standard deviations in our dot plots.

      -(fig5) The kinetics of the response to histamine (fig1C) seems to be much faster as the rapamycin mediated increase in fig5B for the PM condition. Any explanation for this? Why does it not reach a plateau like in the histamine experiments?

      It is probably the recruitment of the p63-DH that takes more time than the activation of the H1R and the downstream signaling. We have the data of the p63-DH recruitment channel so we will check the recruitment kinetics of the p63-DH to the membrane.

      -(fig6F) Data from 6D is repeated here, 6F could potentially show aggregate time-series instead of individual cells. Would also improve interpretation if the membrane marker curve is plotted in every subfigure. Potentially membrane marker intensity could be used to normalise the (TIRF) measurements?

      We will include the data of the membrane intensity for every trace in F.

      -can the authors provide scale bars on the micrographs, as is usually done in any manuscript ? It would also be useful to put time labels when images corresponding to timeseries are shown.

      We will provide the width of the image in the figure captions.

      -ratio values are dimensionless by definition, so no need to write "arbitrary units"

      We will change that.  

      **TEXT COMMENTS:**

      -(abstract): "Due to the improved avidity of the new biosensors for RhoA activity, cellular processes regulated by RhoA can be better understood." -> unclear what the authors mean with 'avidity' in this context? (here, and throughout rest the manuscript)

      Avidity refers to “the accumulated strength of multiple affinities”, we added this explanation to the text in the introduction. Another paper working with multiple binding domains to improve a relocation sensor also calls it avidity: A high-avidity biosensor reveals plasma membrane PI(3,4)P2 is predominantly a class I PI3K signaling product (Goulden et al. 2018 JCB).

      -(introduction) "Although these three Rho GTPases may have different functions, we generally refer to RhoA in this manuscript." -> unclear what message the authors try to convey with this sentence.

      We changed to: “We will use ‘Rho’ throughout the manuscript, which refers to all three isoforms”

      -(introduction) "Active RhoA mainly localizes at the plasma membrane, due to its prenylated C-terminus" -> where else would it be localised? Where is inactive RhoA localised?

      We included: “Active Rho mainly localizes at the plasma membrane, due to its prenylated C-terminus (Garcia-Mata et al., 2011). However, a fraction of RhoA has been found at the Golgi apparatus. Inactive RhoA, in comparison, can be extracted from the plasma membrane by Rho-specific guanine nucleotide dissociation inhibitors (RHOGDIs) (Garcia-Mata et al., 2011)”.

      -(introduction) "Unimolecular Rho GTPase FRET-based biosensors consist of the Rho GTPase itself, a GBD and a FRET pair." -> a short description/explanation of what a "FRET pair" is would benefit the non-specialised audience.

      We included: “Unimolecular Rho GTPase FRET-based biosensors consist of the Rho GTPase itself, a GBD and a FRET pair, which is commonly a cyan and a yellow fluorescent protein.”

      -(Results p9) "For the original Anillin AH+PH sensor...around 15%" -> did the authors do the experiment with G14V on this original sensor variant?

      Yes, it is supposed to say AHD+PH here as well, which has been corrected. We performed the experiment with mScarlet-AHD-PH.

      -(Results p9) The "mScarlet-I-AHD+PH" seems to perform quite good on the fig4D assay, but is not present in 4C analysis?

      eGFP-AHD+PH was used as the original sensor for the 4C assay. Because of the color of the RhoA G14V (mTq2), we switched to the mScarlet version to exclude bleed-through. We assume that the sensor performs similarly with different monomeric fluorescent proteins.

      -(Results p9) "mScarlet-I-AHD+PH" is the same as "AHD+PH (aGBD+C2+PH)"? descriptions unclear. Would generally advise to thoroughly check the manuscript for consistency of condition descriptions / abbreviations in both text and legends.

      Changed to: AHD+PH (consisting of aGBD+C2+PH). We mention earlier: “Moreover, a published relocation sensor AHD+PH based on Anillin contains, next to a G protein binding domain, also a C2 and a PH domain and localizes in punctuate structures which do not represent Rho activity (Figure 2C,Supplemental Movie 4 and 5) (Munjal et al., 2015; Piekny & Glotzer, 2000). Here, we used only the G protein binding domain of Anillin (aGBD) as a basis for another sensor.”

      -(Results p12) "Visualizing endogenous RhoA activity" as subsection title could potentially confuse readers, since all measured Rho activity in the manuscript is endogenous.

      That could indeed be confusing. What we intended to highlight is that we did not overexpress any signaling molecules or receptors in these experiments. We changed the title to: “Visualizing endogenous Rho activity under physiological conditions”

      **minor text:**

      -(fig3b legend) "mScralet-I-1xrGBD"

      Corrected

      -(fig6H legend) "TRIF", and "cbBOEC" is same as "BOEC"?

      It is a detail, but these are indeed different and we have updated the materials and methods to better reflect this: “cord blood Blood Outgrowth Endothelial cells (cbBOEC)” and “Blood Outgrowth Endothelial cells from healthy adult donor blood (BOEC)”

      Reviewer #4 (Significance (Required)):

      The novel "Rho" family GTPase relocation sensor that the authors present might be a significant improvement over the currently existing ones (for refs, see manuscript). This might provide a substantial technical advance in the field and increase the utilisation and the reproducibility of this tool in the field. This sensor will be of significant interest for the Rho GTPase signalling field, and more broadly the cytoskeleton biology community. My expertise in Rho GTPase biology, biosensor development and advanced microscopy granted me the opportunity to judge the complete manuscript.

      The reviewer thinks that the new sensor will be of significant interest and we agree.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #4

      Evidence, reproducibility and clarity

      SUMMARY:

      Mahlandt and colleagues use advanced microscopy techniques to test new configurations of several Rho relocation sensors, which report on the activity of members of the endogenous RhoA GTPase family of proteins. A novel variant containing the dimericTomato fluorescent protein and a double rGBD domain shows a substantial increase in dynamic range in comparison with 2 originally published sensors and other new variants they tested. They use a cellular assay to show that this novel variant is specific for the activity of the Rho family of Rho GTPases and not the Cdc42/Rac families. Finally, the authors show that this new variant can be used to measure a specific localised increase of Rho activity at the Golgi, and during cell division and cellular morphology changes that are known to activate the RhoA family of Rho GTPases. The biosensor can be useful for the community. However, I think the paper is not well written (I was very confused by several statements). The manuscript should be thoroughly proofread; there are quite a few unclear or duplicate passages (for examples, see "text comments" below). Currently this hampers the interpretation of the manuscript for the reader. The authors are very dogmatic: they make claims about the literature that I do not agree with at all. Some of these unbalanced views will confuse the non-expert readers.

      MAJOR COMMENTS:

      -The reported dTomato-rGBD sensor is unable to distinguish between the different members of the RhoA family of Rho GTPases (it measures the combined activity of RhoA, RhoB and RhoC), which the current phrasing of the introduction leaves unclear to the reader. The authors seemingly suggest throughout the manuscript that they work with a specific RhoA biosensor, which is not the case. This strong statement is completely misleading. The authors need to refer to the biosensor as being specific for Rho (RhoA,B,C) GTPases versus Rac1/Cdc42 biosensors, and discuss what this means for the field. Some discussions about this are made in a JCB paper by Graessl et al, that the authors also cite.

      -If the authors really want to sell that the biosensor is only specific for RhoA, then they need to make a series of experiments with RhoB and RhoC dominant positive/negative constructs, to tackle that specific point.

      -Did the authors consider to use the artificial GBD from Keller, 2019 to make a specific relocation sensor for RhoA? Perhaps the authors can comment on the feasibility of this approach?

      -A strong (dogmatic) statement is that Rho GTPases FRET sensors report solely on the activity of GEFs. This is not the case, these sensors report on the flux of GAP and GEF activity for Rho GTPase in cells. This is also true for relocation sensors, and has been documented in work from the Bement/Pertz/Nalbant/Dehmelt labs.

      -From the data in Figure 1, it seems to follow that the efficiency of PM relocation is mainly determined by the number of rGBD modules on the sensors. Could the authors speculate on how this works in practice; is the multi-rGBD sensor increasingly kinetically trapped by a single RhoA molecule, or is the sensor mostly bound to multiple RhoA molecules at the PM?

      -Some form of statistical analysis should be performed on the data to give the reader a sense of robustness of the findings and its uncertainty. Either a non-parametric test on the median, confidence intervals or e.g. boxplots showing notches.
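
      As a rough illustration of the kind of uncertainty estimate requested here, the sketch below computes a percentile bootstrap confidence interval for the median of per-cell responses. The function name `bootstrap_median_ci` and the values in `responses` are hypothetical and are not taken from the manuscript; this is one possible approach, not the analysis the authors or referees used.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the median.

    Returns (sample_median, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample the cells with replacement and record each resample's median.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = medians[int((alpha / 2) * n_resamples)]
    upper = medians[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.median(values), lower, upper


# Hypothetical per-cell changes in cytosolic intensity (%), for illustration only.
responses = [-52.0, -47.5, -61.0, -38.2, -55.4, -49.9, -44.1, -58.3, -41.7, -50.6]

median, ci_low, ci_high = bootstrap_median_ci(responses)
print(f"median = {median:.1f}%, 95% CI = [{ci_low:.1f}%, {ci_high:.1f}%]")
```

      Resampling whole cells, rather than individual time points, keeps each cell as the unit of observation; with data from several independent experiments, resampling at the experiment level may be more appropriate.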

      -Time-series now show single example traces (fig1C, fig2B,D, fig5B). It would be informative for the reader if the curves of all experiments were plotted, and statistical analysis were performed on the data. It is unclear how representative the kinetics in these curves are.

      -About the spatial patterns of Rho activity (cytokinesis, tail retraction, ...), the reviewers agree that statistical analysis is much more difficult. But maybe showing 2-3 cells instead of only one, would make the data more convincing.

      MINOR COMMENTS:

      -(fig4a) dTomato-2xpGBD: why does this not work well? How is it possible that it binds well in the nucleus, yet no translocation is observed? Constitutive activity? Expression levels?

      -(fig4f) The aGBD/pGBD binding sites for RhoA show great overlap but bind to completely different sites at RhoA, is this correct? (color scheme used for the structures is not easily interpretable)

      -(fig5) Unclear how the intensity at the specific organelles is measured? were the organelles segmented or hand-drawn ROI based? The quantified difference is very small, no statistics are performed, and it is unclear how it was measured. This is currently weak evidence for the main claim in this subsection.

      -(fig5) The kinetics of the response to histamine (fig1C) seem to be much faster than the rapamycin-mediated increase in fig5B for the PM condition. Any explanation for this? Why does it not reach a plateau like in the histamine experiments?

      -(fig6F) Data from 6D is repeated here, 6F could potentially show aggregate time-series instead of individual cells. Would also improve interpretation if the membrane marker curve is plotted in every subfigure. Potentially membrane marker intensity could be used to normalise the (TIRF) measurements?

      -Can the authors provide scale bars on the micrographs, as is usually done in any manuscript? It would also be useful to add time labels when images corresponding to time series are shown.

      -ratio values are dimensionless by definition, so no need to write "arbitrary units"

      TEXT COMMENTS:

      -(abstract): "Due to the improved avidity of the new biosensors for RhoA activity, cellular processes regulated by RhoA can be better understood." -> unclear what the authors mean with 'avidity' in this context? (here, and throughout rest the manuscript)

      -(introduction) "Although these three Rho GTPases may have different functions, we generally refer to RhoA in this manuscript." -> unclear what message the authors try to convey with this sentence.

      -(introduction) "Active RhoA mainly localizes at the plasma membrane, due to its prenylated C-terminus" -> where else would it be localised? Where is inactive RhoA localised?

      -(introduction) "Unimolecular Rho GTPase FRET-based biosensors consist of the Rho GTPase itself, a GBD and a FRET pair." -> a short description/explanation of what a "FRET pair" is would benefit the non-specialised audience.

      -(Results p9) "For the original Anillin AH+PH sensor...around 15%" -> did the authors do the experiment with G14V on this original sensor variant?

      -(Results p9) The "mScarlet-I-AHD+PH" seems to perform quite good on the fig4D assay, but is not present in 4C analysis?

      -(Results p9) "mScarlet-I-AHD+PH" is the same as "AHD+PH (aGBD+C2+PH)"? descriptions unclear. Would generally advise to thoroughly check the manuscript for consistency of condition descriptions / abbreviations in both text and legends.

      -(Results p12) "Visualizing endogenous RhoA activity" as subsection title could potentially confuse readers, since all measured Rho activity in the manuscript is endogenous.

      minor text:

      -(fig3b legend) "mScralet-I-1xrGBD"

      -(fig6H legend) "TRIF", and "cbBOEC" is same as "BOEC"?

      Significance

      The novel "Rho" family GTPase relocation sensor that the authors present might be a significant improvement over the currently existing ones (for refs, see manuscript). This might provide a substantial technical advance in the field and increase the utilisation and the reproducibility of this tool in the field. This sensor will be of significant interest for the Rho GTPase signalling field, and more broadly the cytoskeleton biology community. My expertise in Rho GTPase biology, biosensor development and advanced microscopy granted me the opportunity to judge the complete manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      In this paper, Mahlandt et al compared and improved relocation sensors to visualize the activity of endogenous Rho. As a result of screening for several Rho binding domains (GBDs) and the number of repeats, the authors found that dTomato-2xrGBD is optimal, and succeeded in visualizing the activity of Rho during cytokinesis and migrating cells. Overall, this sensor would be a useful tool for many cell biologists. The data are represented clearly in the figures. I provide some concerns; that would be worth addressing in a revised version.

      Major comments

      1. The authors should experimentally show the quantitative relationship between biosensor expression level and degree of relocation. In principle, this relocation type sensor binds to the endogenous GTP-bound Rho. Since the number of endogenous GTP-bound Rho is limited in cells, the degree of relocation is considered to be dependent on the expression level of the sensor. If the number of biosensors expressed is too small in a cell, the response will be saturated. If the number of biosensors is too large, the relocation will be weakened and the Rho signal will be suppressed. Furthermore, although a weak promoter is used, the heterogeneity of the expression level in each cell makes quantitative analysis difficult, especially in transient expression experiments. I would like to suggest the addition of quantitative experimental data.
      2. Most of the time-series data show only a representative example, namely, N = 1. In relation to the aforementioned issue, data and distribution derived from several cells (e.g. SD) should be shown in a clear manner.
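
      The expression-level argument in comment 1 can be sketched with a simple 1:1 binding equilibrium between the sensor and GTP-bound Rho. This is an assumption made purely for illustration: it ignores avidity from multiple GBDs, membrane geometry and GAP/GEF dynamics, and the function name `bound_complex` and the values of `RHO_GTP` and `KD` are hypothetical (arbitrary concentration units), not measurements from the manuscript.

```python
import math


def bound_complex(sensor_total, rho_total, kd):
    """Equilibrium concentration of sensor:Rho-GTP complex for 1:1 binding.

    Solves C^2 - (S + R + Kd) * C + S * R = 0 for the physical (smaller) root.
    """
    b = sensor_total + rho_total + kd
    return (b - math.sqrt(b * b - 4.0 * sensor_total * rho_total)) / 2.0


RHO_GTP = 1.0  # hypothetical total active Rho (arbitrary units)
KD = 1.0       # hypothetical dissociation constant (same units)

for sensor in (0.01, 0.1, 1.0, 10.0, 100.0):
    complex_conc = bound_complex(sensor, RHO_GTP, KD)
    fraction_relocated = complex_conc / sensor   # fraction of sensor bound
    rho_occupied = complex_conc / RHO_GTP        # fraction of active Rho bound
    print(f"sensor={sensor:7.2f}  relocated={fraction_relocated:.3f}  "
          f"Rho occupied={rho_occupied:.3f}")
```

      Under these assumptions, the fraction of sensor that relocates falls as expression rises, while the fraction of active Rho occupied by the sensor approaches one, which is the regime in which interference with endogenous signalling would be expected.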

      Minor comments

      1. I hesitate to call the biosensor developed in this study "RhoA sensor". This is because, as the authors mention, it has been reported that the rGBD also binds to RhoB and RhoC. If the authors call it a RhoA sensor, they should investigate the specificity of binding to RhoB and RhoC in addition to RhoA. If not, I would like to suggest changing the name to "Rho sensor" instead of "RhoA sensor".

      Significance

      Rho is one of the low molecular weight G proteins that regulate the rearrangement of the actin cytoskeleton. Intramolecular and intermolecular FRET biosensors and relocation sensors have been reported as biosensors for visualizing the activity of Rho proteins. The latter is not as widely used as the former due to its inadequate sensitivity and specificity. Therefore, improving the Rho biosensor is very important and is needed by the community in the field of cell biology research. I believe the importance of this manuscript is that the author compared existing relocation-type Rho sensors. This is beneficial and informative.

      My expertise: Cell biology, live-cell imaging, development of genetically encoded fluorescent probes

      Referees cross-commenting

      I generally agree with Reviewer 2's opinion. The opinions of our three reviewers can be summarized in three points: expression level, specificity, and statistical analysis and representation. I think these should be put to the authors as major criticisms to be addressed before publication.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Visualization of subcellular activity of GTPases is critical for the understanding of signal transduction of cell growth, differentiation, morphogenesis, etc. For this purpose, researchers often use relocation probes, which comprise a fluorescent protein(s) and a GTPase-binding domain(s), and move from the cytosol to the location of active GTPases. The authors improved a previously reported RhoA probe with a strategy of increasing the avidity of the RhoA-binding domain and optimizing the fluorescent protein. In the beginning, the authors declare "the relocation of the original, single rGBD monomeric fluorescent protein sensor is hardly detectable" in HeLa cells. To overcome this problem, they developed six constructs by changing the number of rGBD (rhotekin GBD) domains and fluorescent proteins. They found that an increase in the number of rGBD domains and a dimerization-prone fluorescent protein, tdTomato, generate a better probe for RhoA activity. The specificity was examined by using active Rac1 and Cdc42 proteins. Different RhoA-binding domains derived from Rhotekin, PKN1, and Anillin were compared to show the superiority of rhotekin GBD. Finally, using vascular endothelial cells, they show that subcellular RhoA activation detected by the probe is consistent with current knowledge of RhoA activation. Overall this work has been done well and in an organized way, and discloses a novel RhoA probe that will be useful in future research on RhoA.

      Major comments:

      1. Reproducibility: The number of analyzed cells is described in the legend, but the number of independent experiments is not shown. This is critical to evaluate the reproducibility of the data. Preferably, the data should be presented so that the data set derived from each trial is clearly shown. It should also be described how cells were selected for the analysis. It is also preferable to apply automatic analysis. Ideally, the raw data with code sets for analysis should be presented.
      2. A serious defect of the relocation probe is its dependency on the expression level. The lower the number of probe molecules in a cell, the higher the fraction recruited to active RhoA. However, lowering the probe concentration will be accompanied by dim fluorescence. The authors should describe how the optimal expression level was achieved.
      3. Statistical analysis is absent throughout the paper.

      Minor comments:

      1. In Figure 1, mNeonGreen (mNG) was used as the fluorescent protein fused to rGBD instead of EGFP, which was used in the original paper. For a fair comparison with the previous report, analysis using the original probe, i.e., EGFP-rGBD, is desirable. Alternatively, the authors may simply tone down the claim.
      2. In the introduction, it says "The RhoA FRET sensors achieve subcellular resolution to a certain extent, but due to their design they do not localize as endogenous RhoA". A reference is required.
      3. rGBD should be rhotekin GBD. It should be clearly stated in the beginning.
      4. The reason why the CMVdel promoter is used should be stated clearly.
      5. Page 23: TRIF should read as TIRF.
      6. Figures: Grey letters should be avoided.
      7. Fig. 3A: Apparently the probe binds to Rac1 G12V to some extent. The discrepancy of RhoA localization between mSca-1xrGBD and dt-2xrGBD must be discussed. This observation clearly suggests that GBD may change the localization of RhoA. It is interesting to note that Rac1 and RhoA may localize to the nucleolus.

      Significance

      1. This work discloses an improved RhoA probe, which will be welcome by the researchers in the field of small GTPases.
        1. Novelty of increased GBD: The idea of increasing the GTPase-binding domain in the relocation probe was reported some time ago: Augsten et al., Live-cell imaging of endogenous Ras-GTP illustrates predominant Ras activation at the plasma membrane. EMBO Rep. 7, 46-51 (2006).
        2. Novelty of rhotekin GBD: The reason why GBD of PKN is chosen in intramolecular FRET biosensors such as DORA and Raichu is that the affinity of other GBD's is too high [Table 1, Yoshizaki et al., J. Cell Biol. 162, 223-232 (2003)]. Judging from this old data, GBD's of mDia and Rhophilin, may work better than that of Rhotekin. Moreover, it is known that PH domain may be required for proper conformation of GBD's. Thus, it is not surprising that removal of PH domain from the Anillin probe abolishes its translocation ability. Therefore, to the reviewer's eyes, the choice of GBD in Figure 4 is biased to those that will work less efficiently.
        3. The authors' proposal of "systematic optimization" sounds exaggerated, considering the small number of constructs tested in Fig. 1 and Fig. 4. Similarly, it is not clear whether dimerization-prone fluorescent proteins are a better choice from simply comparing tdTomato and mNeonGreen.
        4. Keywords of expertise: Fluorescent probes. Cell signaling.

      Referees cross-commenting

      Because Review Commons does not specify the journal in which the work will be published, the requests by Reviewer #1 seem excessive. The probe reported in this work deserves publishing, although it may not be a ground-breaking probe.

      Reading the comments by the other reviewers, the following concerns should be addressed.

      1.Relationship between the probe's concentration and the response.

      2.Specificity to RhoA, RhoB, and RhoC

      3.The effect of the cell morphology as pointed by Reviewer #1.

      To Reviewer #1

      -"Since equimolar distribution of the moieties is not guaranteed, this affects the detection characteristics of this biosensor. This point should be discussed and emphasized." The probe will diffuse rapidly within the cytosol. Therefore, the subcellular concentration of the probe may not significantly affect its performance.

      -"What is the effect of histamine stimulation on dT2xrGBD biosensor response when this one is forced to be located in other subcellular compartments (mitochondria, nucleus) by fusing the construct to targeting sequences?" I did not understand this question quite well.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Major points:

      • The affinity analyses need more work. This applies to the A/B/C isoforms, and the dimerization affinity between the fluorescent proteins could also change the apparent on/off rates. This point is not quantified or discussed. In a chemical equilibrium analysis, the apparent equilibrium is affected not only by these on/off rates but also by the local availability (concentrations) of the reacting moieties. In the limit where the biosensor concentration is low within a cellular subcompartment, or vice versa, how is this going to change the sensitivity of detection, given that it can push the reaction in either direction? Since equimolar distribution of the moieties is not guaranteed, this affects the detection characteristics of this biosensor. This point should be discussed and emphasized.
      • Fig 1 A: Are the fluorescence changes of the biosensors due to stimulation with histamine completely reversible? In other words, is it possible to see a total recovery of the signals with pyrilamine or in the presence of another antagonist? If not, why? Does histamine stimulation induce a maximal activation of RhoA in HeLa cells? What happens in terms of fluorescence changes when the activity of RhoA is inhibited or in the presence of a Gαq-inhibitor, and in conditions in which RhoA activating GEF, RhoA GAP or RhoA GDI is overexpressed? Generally, I think it is useful to have a calibration curve of the biosensor's activity, maximal/minimal (ON/OFF) response. For example, it would help to answer the question concerning the biosensor's binding affinity for RhoA ("The function of rhotekin is not clear, it seems to lock RhoA in the GTP bound state (Ito et al., 2018; Reid et al., 1996). We can only speculate that rhotekin has a stronger binding affinity for active RhoA than anillin and PKN1 have." (p.15)). What is the effect of histamine stimulation on a membrane marker expression/location? What is the effect of histamine stimulation on dT2xrGBD biosensor response when this one is forced to be located in other subcellular compartments (mitochondria, nucleus) by fusing the construct to targeting sequences? Physiological control: Effect of the presence of the biosensor on cell morphology/behavior... Experimental data concerning this point are mentioned in the discussion section. "We demonstrate that low expression of the biosensor, through the truncated CMV promotor, did not inhibit cell division and cell edge retraction. Plus, endothelial cells expressing the sensor still show the typical reaction of contracting followed by spreading, when stimulated with thrombin. Low expression results in a low fluorescent signal of the sensor." (p.16) I think these results would deserve a section in this manuscript.
      • Fig 2D: "The anillin sensor AHD+PH showed a 15% decrease in cytosolic intensity (Figure 2D), but it also relocalizes to striking punctuate structures upon histamine stimulation. These structures did not seem to represent local, high activity of RhoA, as the optimized rGBD sensor in the same cell showed no such locally clustered RhoA activation, but rather a homogenous activation at the membrane and a 60% drop in cytosolic intensity. Similar punctuate structures were observed in endothelial cells, when stimulated with the strong RhoA activator thrombin (Supplemental Movie 5)." And p. 15: "However, we noticed that the AHD+PH sensor, containing aGBD, C2 and PH domain, localizes in a punctate manner. These 'dots' were observed in both HeLa cells and endothelial cells and were only observed with the AHD+PH RhoA sensor. As aGBD does not localize in puncta, it seems that the localization is caused by domains other than of the RhoA binding domain, i.e. the C2- and/or PH-domain." Punctate structures are also present in HeLa cells expressing the anillin sensor before histamine stimulation (see Supplemental Movie 4). Moreover, the punctate pattern activated by thrombin in endothelial cells looks different (more widespread) than the one activated by histamine in HeLa cells. In addition, these structures can also be found in human endothelial cells expressing dT2xrGBD (fig. 6B, Supplemental movie 10). What are the thrombin-activated structures in endothelial cells that would be similar to the ones in HeLa cells activated by histamine and that "did not seem to represent local, high activity of RhoA"? This is not further commented on by the authors.
      • Fig 3A: "The rGBD sensors solely colocalized in the nucleus with RhoA but not with Rac1 and Cdc42, indicating that rGBD specifically binds constitutively active RhoA." What about dT2xrGBD binding specificity for the three homologues RhoA, RhoB and RhoC? This point is mentioned in the discussion (p.16), but there are no experimental data to support it: "The specificity of the relocation sensor is determined by the binding specificity of the GBD. The rGBD binds the three homologues RhoA, B and C but not to Rac1 and Cdc42". So why is rGBD presented as a RhoA biosensor?
      • Fig 3B: The data scatter for the dTomato-2xrGBD is very wide compared to the mScarlet-1xrGBD. What is causing this wide data scatter and such heterogeneous response? This is a problem if the sensor is really so heterogeneously responding to a strong mutant of RhoA, is this a dimerization-dependent problem?
      • These domain-based biosensors could cause dominant negative/inhibitory artefacts. Also the dimerizing fluorescent proteins could introduce oligomerization of the signaling complex which is not real in cells and clearly affect phenotype. These issues should be tested and addressed by a quantitative measure of cell behavior against increasing concentration/changing dimerization potentials of the biosensor in live cell assays.
      • Fig 4 C: "Given the successful improvement of the rGBD-based biosensor by increasing the number of binding domains, we explored whether the same strategy can be applied to the G protein binding domains from PKN1 and Anillin" and "The dimericTomato-2xrGBD sensor shows the best relocation efficiency, with a median change in cytosolic intensity of close to 50%"... So why has the dT-2xaGBD construct not been tried?
      • p.9: "None of the pGBD sensors showed a clear membrane localization upon stimulation with histamine (Figure 4A). The increase in cytosolic intensity observed in some cells, seems to be caused by changes in cell shape." Are changes in HeLa cell shape induced by histamine stimulation? How can this be explained? Do some cells expressing the rGBD sensors (single, tandem and triple and dimericTomato) undergo these changes of shape too, upon histamine stimulation? If yes, to what extent do these changes in cell shape affect the signals?
      • p9: Overall, the paragraph about Fig 4 E,F is not clear. What do the amino acid sequences of the G Protein Binding Domains of Anillin and PKN1 contribute to the understanding of the rGBD, aGBD and pGBD sensors?
      • p. 12, Fig 6C, Fig. 6E: "The membrane marker showed a relatively small increase in intensity after stimulation and the curve did not show the same pattern as the RhoA biosensor intensity curve. Therefore, we conclude that the increase in RhoA biosensor intensity is caused by relocalization." It surprises me that a decrease in cell area induced only a very small increase in fluorescence intensity of the membrane marker. It would be very helpful to see a figure with a quantification of the membrane marker intensity changes during this process. What about a cytoplasmic marker? In addition, how is the movement artefact corrected? "Our data revealed that the RhoA biosensor displays RhoA activity at subcellular locations where RhoA activity is expected, and appears mostly independent of fluorescent intensity measured by a separate membrane marker." This part should be developed further. Are there examples of cells for which the biosensor activity is dependent on fluorescent intensity measured by a separate membrane marker?
      • Discussion (p.16): "Comparing relocation sensors to FRET sensors, both have their own advantages and disadvantages." The dT2xrGBD sensor is here presented as a new relocation sensor for RhoA activity. However in general, there should be more development of the direct comparisons, pros and cons, with quantitative data or more details allowing to have a general overview of the advantages and disadvantages of this new relocation biosensor as compared to the existing ones.

      Minor points:

      • Overall, scale bars should be included in the HeLa cell microscopy images.
      • It was not clear until the Methods section that the widefield analysis appeared to be normalized against another fluorescent protein-based cytoplasmic signal to correct for variations in cell volume. I think this point should be mentioned in the main text more prominently and emphasized so that readers are not misled.
      • p. 9 : "Anillin AH+PH sensor" instead of "Anillin AHD+PH sensor"
      • Fig 2B and 2D: Explain what parameter is used for the normalization of each signal.
      • Fig. 1A, top panel: it would be good to know which images correspond to the addition of histamine and which ones correspond to the addition of pyrilamine
      • "TRIF microscopy" is written in the legends of Fig. 6 and of Supplemental movie 11, and in the Materials and Methods section p. 23
      • Fig. 3 legend: Correct "mScralet-I-1xrGBD"
      • Fig 4F, legend: "Anillin and the bound RhoA are depicted in dark and light yellow, respectively. PKN1 and the bound RhoA are depicted in light and dark blue, respectively." The color codes in the legend are the opposite of those in the figure.
      • p.11 : "To examine this, we used a rapamycin-induced hetero dimerization system to recruit the dbl homology (DH) domain, of the RhoA activating GEF p63, to the membrane of the Golgi apparatus." Corresponding references should be included.
      • Fig. 5A : Explain FRB, Fig 5C : no unit for a ratio

      Significance

      Mahlandt et al. optimized and compared several G protein binding domain (GBD)-based biosensors in order to improve the potential of existing RhoA-domain-based biosensors for visualizing and reporting RhoA subcellular activity in living cells and tissue. The authors demonstrate that fusing a dimerizing fluorescent protein to the rhotekin GBD (rGBD) is an efficient strategy to increase the brightness of the sensor. The use of Rhotekin-RBD as affinity domain for Rho-class of GTPase is very well established, both in the methods of affinity pulldowns and in biosensor designs for Rho-class of GTPases in the field. The authors show that the dimericTomato-2xrGBD biosensor can indicate endogenous RhoGTPase spatial activity in dividing HeLa cells and during cell retraction of human endothelial cells.

      The dimericTomato-2xrGBD biosensor is thus introduced and described as a RhoA localization-based biosensor; however, no experimental data demonstrate the binding specificity of the biosensor for RhoA. Moreover, the authors discuss previous work showing that rGBD binds the three paralogs RhoA, RhoB and RhoC. This point, and the apparent singular claim that this biosensor reports RhoA activity, as the manuscript alludes to, are inappropriate and misleading. Given that the field has moved over the past 20 years toward assigning more specificity (not less) to the GTPase that each biosensor detects, e.g., via FRET, this significantly tempers the enthusiasm of this reviewer. In addition to this main issue, the characterization of the relative affinities of the domain for the target GTPase isoforms and of the dimerization affinities of the fluorescent proteins (which could change the apparent reaction rate constants) is incomplete, and the impact of these on reversibility, oligomerization states, detection sensitivity, and the biology also appears insufficiently addressed. Additional stoichiometric considerations and the apparent reaction equilibria, which are impacted by the relative concentrations of the interacting moieties, require careful further analysis and discussion. In general, I think that this work could be interesting to a more specialized field audience with further analyses of the affinities of the interacting moieties and better characterization of the behavior of this biosensor in living cells, since it is likely causing oligomerization of the signaling units due to the forced dimerization of the detection unit.

      Referees cross-commenting

      This is a dimerizing probe. It gets pretty bulky. Is dimerization occurring prior to GTPase binding or after? Is the dimerized probe/GTPase complex somehow more stable than would otherwise be if they were monomeric? If so, how would that affect the lifetime of the detection and also the diffusivity of the probe("s", if already dimerized) and possibly the whole oligomer?

      It still feels to me that, yes, new brighter fluorescent proteins were used, and dimerization and multimerization of the signaling complex increased the SNR of the system, but the whole premise just reverted the biosensor field back 20 years, which has been my biggest single concern regarding this paper.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewers

      Title: "Towards deciphering the Nt17 code: How the sequence and conformation of the first 17 amino acids in Huntingtin regulate the aggregation, cellular properties, and neurotoxicity of mutant Httex1".

      Tracking #: RC-2021-00675 Authors: Vieweg et al.

      MAJOR COMMENTS from Referees #1, #2, and #3

      Referee #1

      General comments

      « The manuscript by Vieweg, Mahul-Mellier, Ruggeri et al., describes the role of the sequence and conformation of the extreme N-terminus of the Huntingtin protein in terms of aggregation and toxicity together with its relation to the polyglutamine length. The authors use some outstanding methods to ensure that the conclusions are based on good quality data. Overall, this is an excellent study. »

      We thank the referee for the very positive feedback and for recognizing the quality of our work and his/her appreciation of our systematic approach to dissect the role of the Nt17 domain in regulating the aggregation, cellular properties, and neurotoxicity of mutant Httex1.

      « The manuscript is generally well written although it might benefit from reducing the length of the discussion section. »

      We thank the referee for his/her valuable comment. We have reduced the discussion by 10%, as requested.

      Major comments

      1) “For their in vitro data, the authors do not go beyond 42 polyglutamines. Is there any particular reason for that? The authors see a clear difference between 36Q and 42Q, but although not critical, it would have been useful to use longer repeats. In my view, the authors should at least discuss the rationale for this, particularly as in cellular models they do use 72Q constructs.”

      We thank the referee for raising this point.

      Most HD patients have a polyQ repeat stretch of 40-45 glutamines (1-4).

      In vitro, the use of Httex1 constructs consisting of 42 polyglutamine residues is sufficient to induce mutant Httex1 aggregation and fibril formation. Mutant Httex1 proteins with polyQ repeats of 72Q or higher are highly aggregation-prone and difficult to purify, handle, or disaggregate. This is why all of the in vitro aggregation studies are based on mutant Htt proteins with polyQ repeats ranging from 23Q to 53Q (5-14). We have reviewed the literature carefully and were unable to identify any in vitro studies with recombinant Htt proteins containing polyQ repeats of 72 or greater.

      In cells, induction of mature Htt inclusions requires much longer polyQ repeats. This is clearly reflected by the fact that most cellular studies use mutant Htt with polyQ repeats above 64Q and up to 160Q to induce the formation of cellular aggregates (10, 15-30).

      We have recently conducted a systematic study on the effect of the polyQ repeat length on Htt inclusion formation in cells https://www.biorxiv.org/content/10.1101/2020.07.29.226977v1 (21). Characterization of the inclusions by EM revealed that the polyQ tract length dramatically influences the ultrastructural properties and the architecture of the Httex1 inclusions in cells. The dark shell structure that delimited the core from the periphery of the Httex1 72Q inclusions was absent in the Httex1 39Q inclusions. Also, the Httex1 39Q inclusions appeared less dense than the Httex1 72Q inclusions. Finally, no significant cell death was observed in HEK cells overexpressing 39Q constructs, while overexpression of Httex1 72Q was toxic. For these reasons, we and others choose to use Httex1 with polyQ repeat of 75 or higher.

      2) « The role of the N-terminus 17 aminoacids of huntingtin (Nt17) is addressed by comparing peptides with and without the Nt17 and their relation to the adjacent polyglutamine tract. Using this approach, the peptide without the Nt17 is composed of pure polyglutamines in its N-terminus, followed by the rest of exon 1 in its C-terminus. This is clearly the key comparison to address the role of the Nt17 in the context of an exon1 containing polyQ. »

      Yes. In fact, we did perform this experiment and assessed if the addition of the Nt17 would be sufficient to inhibit mutant Httex1 aggregation or make ΔNt17-Httex1 aggregate similar to Httex1. This data is included in the original version of the manuscript as Figure S5 in supporting material. We observed that __the presence of the Nt17 peptide during the aggregation of ΔNt17-exon1 fibrils did not interfere with the aggregation kinetics of mutant Httex1 or alter the fibril morphology of ΔNt17-exon1,__ indicating that intramolecular interactions between the Nt17 domain and the adjacent polyQ tract are key determinants of mutant Httex1 fibrillization and fibril morphology.

      Referee #2

      General comments

      This article describes the results from studies into mechanisms of the aggregation and toxicity of Htt Exon1 protein. The authors investigated the role of N17, polyQ length, M9C mutation, and phosphorylation. Multiple approaches were used that included biochemical protein design, biophysical measurements, and cell biological experiments with cultured mammalian cells. The authors demonstrate the effects of protein context on aggregation. Furthermore, the authors were able to visualize the aggregates in mammalian cells and in neurons using multiple methods. These are interesting data.

      We greatly appreciate the positive feedback on our data and our systematic and integrative approaches.

      There are several major weaknesses in the study. The first problem is that most of the results related to aggregation mechanisms and toxicity are not original and are incremental when compared to many previously published articles.

      We respectfully disagree with this assessment and suggestion that our studies are not original and represent only incremental advances.

      Originality

      Our study provides novel mechanistic insights into the role of not only the sequence but also the conformational properties of the Nt17 domain in regulating the dynamics of Httex1 fibrillization, the kinetics of fibril growth, and the structure and morphology of Httex1 fibrils. In addition, we addressed for the first time how the Nt17 domain and phosphorylation at different residues within this domain regulate the cellular uptake, subcellular localization (phosphorylated proteins) and toxicity of extracellular monomeric and fibrillar forms of mutant Httex1 in primary neurons. We are not aware of previous reports that have conducted similar studies addressing these questions and using multiple methods. Also, some of our findings using native Httex1 sequences are not in agreement with previous reports using Httex1 proteins fused to peptide/protein tags, thus underscoring the limitations of previous studies.

      1) We demonstrated that the Nt17 domain plays an important role in shaping the surface properties of mutant Httex1 fibrils and regulates their lateral association and cellular uptake. Our findings are not in agreement with previous findings published in eLife by Shen et al. __(10)__, who reported that removal of the Nt17 domain has the opposite effect of what we observed in our study, i.e., ∆Nt17 promotes the formation of fibrils that exhibit a low tendency to laterally associate and form a “bundled” architecture (10). Careful examination of their constructs revealed that all the proteins they used contained a highly charged 15-mer peptide tag (S-tag: Lys-Glu-Thr-Ala-Ala-Ala-Lys-Phe-Glu-Arg-Gln-His-Met-Asp-Ser) at the C-terminus of Httex1, which we believe would strongly influence the aggregation properties of the mainly uncharged ∆Nt17-Httex1 and Httex1 proteins, thus possibly explaining the discrepancy between our findings and those of Shen et al. (10). In fact, a previous study has shown that adding short peptide tags such as the HA or the LUM tag to mutant Htt171 dramatically changed its toxicity as well as the subcellular localization of Htt171 expressed in cells (e.g., expression of Htt171 carrying the LUM tag was more toxic than untagged Htt171 and induced the formation of nuclear aggregates rather than the classical cytoplasmic aggregates) (31). The reference to this paper is now included and discussed in the main manuscript (page 11).

      Our observations __highlight__ the critical importance of using tag-free proteins to investigate the sequence and structural determinants of Httex1 aggregation and structure.

      2) Our study is the first to demonstrate that the Nt17 domain influences the relationship between fibril length and polyQ repeat length. This correlation disappears when the Nt17 is removed. This aspect of our work was not explored by Shen et al. (10), who limited their in vitro aggregation study to Httex1 wild-type and mutants (∆Nt17 or ∆PRD, for Proline-Rich Domain).

      3) This is also the first study to assess the effect of modulating the helicity of Nt17 on fibril growth and morphology and Httex1 cellular properties.

      a) Using the helix- and membrane-binding-disrupting mutation M8P, we showed that disrupting the Nt17 helix slows the aggregation of Httex1 in vitro but does not alter the morphology of the fibrils. In contrast, removing the Nt17 domain leads to a strong lateral association of the fibrillar aggregates with ribbon-like morphology. This demonstrated that the __Nt17 sequence, but not its helical structure, is the key determinant of the quaternary packing of Httex1 fibrils. Shen et al. (10) did not investigate the effect of modulating the helicity of Nt17 on fibril growth and morphology using the M8P mutant__.

      b) Our cellular studies comparing the membrane association and uptake of extracellularly added Httex1 43Q and M8P Httex1 43Q fibrils in primary rat striatal neurons showed that disrupting the Nt17 helix promotes the internalization of M8P Httex1, while Httex1 stays bound to the plasma membrane. These findings suggest that the Nt17 helical conformation persists in the fibrillar state or that the Nt17 domain regains its helical structure upon interaction with the plasma membrane, resulting in the sequestration of Httex1 fibrils at the membrane and impeding their uptake. This aspect of our study has never been explored in previous studies.

      c) Using site-specific bona fide phosphorylation at T3, S13, S16, and both S13/S16, this is the first study to show that modulating the overall helicity of Httex1 through site-specific phosphorylation of the Nt17 domain (pT3 stabilizes the alpha-helical conformation of Nt17, while pS13 and/or pS16 disrupt it (9)) enhances the rapid uptake of extracellular Httex1 monomeric species into neurons and their nuclear accumulation. Previous studies relied on phosphomimetic mutations (32), which we have shown do not reproduce the effect of phosphorylation at these residues on the structure of Nt17 (8, 9). __Shen et al. (10) did not investigate the effect of modulating the helicity of Nt17 on fibril growth and morphology using site-specific phosphorylation of the Nt17 domain__.

      4) Our overexpression model in HEK cells showed that removing the Nt17 domain or disrupting its helical structure (M8P mutation) was sufficient to prevent the cell death induced by Httex1 72Q overexpression and to drastically reduce the number of cells with inclusions. Our data indicate that the cell death level correlates with the number of cells that contain inclusions, the number of inclusions formed in the cells, and/or their subcellular localization. In contrast to our results, Shen et al. __(10)__ demonstrated that the overexpression of ΔN17-Httex1 induced toxicity at a similar level as the full-length Httex1 in striatal-derived neurons or neurons from cortical rat brain slice cultures, although ΔN17-Httex1 led to a significant reduction of punctate structures in these cells. The discrepancy between their results and those from our Httex1 overexpression model in HEK cells may be due to the fact that in neurons, Httex1 lacking the Nt17 domain accumulates in the nucleus. In contrast, in HEK cells, it stayed mostly cytosolic. In line with this hypothesis, it has been recently shown that cytosolic inclusions (Httex1 200Q) and nuclear aggregates (Httex1 90Q) contribute, to various extents, to the onset and the progression of the disease in a transgenic HD mouse model (33). Thus, the difference in cellular localization but also the cell type (HEK vs. neurons) could influence the toxic response of the cells to the overexpression of ∆Nt17-Httex1, with toxicity triggered only by the nuclear ∆Nt17-Httex1 species.

      5) This is also the first study to investigate the role of the Nt17 domain and Nt17 PTMs in influencing the uptake, the subcellular localization, and the toxicity of extracellular Httex1 species (monomers and fibrils) in primary neurons. We showed that the helical propensity of Nt17 strongly influences the uptake of Httex1 fibrils into primary striatal neurons. At the same time, phosphorylation (at T3 or S13/S16) or removal of the Nt17 domain increased the uptake and accumulation of Httex1 fibrils into the nucleus and induced neuronal cell death. Our findings suggest that the Nt17 domain is exposed in the fibrillar state and is sufficiently dynamic to mediate fibril-membrane interactions and internalization.

      Altogether, our results, combined with previous findings from our group and others demonstrating the role of Nt17 in regulating Htt degradation (34-36), suggest that this domain serves as one of the key master regulators of Htt aggregation, the subcellular localization of the pathological aggregates, and their toxicity. They further demonstrate that targeting Nt17 represents a viable strategy for developing disease-modifying therapies to treat HD.

      Limitations of previous studies:

      Although the effects of the Nt17 domain in regulating Httex1 aggregation and cellular properties have been studied and reported on by other groups, we would like to stress that most of the published studies had major limitations and used protein constructs that do not share the sequence of native Httex1 and exhibit biophysical and cellular properties that differ from those of native Httex1 sequences.

      1) Many of the studies used Httex1-like model peptides (13, 37), which do not contain the complete sequence of Httex1 (e.g., Nt17 peptide (37)), contain additional solubilizing amino acids such as lysine residues (38-43), or are fused to large proteins (e.g., GST, YFP) (37).

      2) Other studies relied on artificial fusion constructs whereby the polyQ domain (44-46) or Httex1 itself (12, 47-61) are fused to large solubilizing protein tags, such as glutathione-S-transferase (GST), maltose-binding protein (MBP) or thioredoxin (TRX) or C-terminal S-tag (10, 62, 63) or fluorescent proteins (e.g., GFP or YFP) (10, 15, 49, 64, 65) for the cellular studies.

      One of the major limitations of using fusion constructs as precursors for the generation of Httex1 (12, 47-61) is the requirement to cleave the fusion protein in situ by adding a protease to release and initiate the aggregation of Httex1. Enzyme-mediated cleavage of Httex1-fusion proteins often results in the incorporation of additional amino acids at the N- or C-terminus of the protein. This could alter the biophysical and biochemical properties of Httex1 because of the important role of the Nt17 domain and the proline-rich domain in regulating the conformational and aggregation properties of the protein (38, 40, 43, 65, 66). Moreover, it has been shown that commonly used enzymes such as trypsin and thrombin may lead to cleavages within the Nt17 domain and result in the generation of undesired Httex1 fragments (7, 42, 60). The net effect of incomplete and/or unspecific enzymatic cleavage of Httex1 fusion proteins is the generation of heterogeneous protein mixtures, which precludes accurate interpretation and comparison of aggregation and structural data across different laboratories.

      Moreover, several studies have shown that the fusion of small peptide tags or large proteins alters the aggregation of mutant Httex1 in vitro and in cells.

      1. We have previously shown that the presence of such tags (e.g., GST) alters the ultrastructural and biochemical properties of Httex1 as well as its aggregation properties in vitro (11).

      We have also recently completed a comprehensive assessment of the GFP tag's impact on the aggregation, inclusion formation, and cellular properties of Httex1 (preprint paper available in BioRxiv https://www.biorxiv.org/content/10.1101/2020.07.29.226977v1 (21)). In this paper, we show that inclusions produced by mutant Httex1 72Q-GFP exhibit striking differences in terms of organization, ultrastructural properties, composition, and their impact on mitochondrial functions as compared to the inclusions formed by the tag-free mutant Httex1 72Q. These findings highlight the critical importance of developing new tools that minimize the impact of large fluorescent proteins, and/or label-free imaging methods, for monitoring Htt aggregation and inclusion formation in cells.

      From Riguet et al. __(21)__. Influence of GFP on the ultrastructural properties of Httex1 cellular inclusions by correlative light electron microscopy (CLEM). CLEM of Httex1 72Q (+/-GFP) transfected in HEK 293 cells after 48h. Confocal images of A. Httex1 72Q and B. Httex1 72Q GFP, 48h after transfection. Httex1 expression (red) was detected using a specific primary antibody against the N-terminal part of Htt (amino acids 50-64) and the nucleus was stained with DAPI (blue). Electron micrographs of C. Httex1 72Q and D. Httex1 72Q GFP inclusions corresponding to confocal images panel A and B (white square), respectively. Add-in binary images generated from electron micrographs by median filtering and Otsu intensity threshold. E. Schematic depictions and original electron micrographs of cytoplasmic inclusions formed by native (tag-free) mutant Huntingtin exon1 proteins (Httex1 72Q, left) and the corresponding GFP fusion protein (Httex1 72Q-GFP).

      A recent study by Chongtham et al. (31) also supports our findings and shows that adding short peptide tags such as the HA or the LUM tag to mutant Htt171 dramatically changed the toxic properties of Htt171 as well as its subcellular localization and the compactness of the aggregates formed in cells (e.g., expression of Htt171 carrying the LUM tag was more toxic than untagged Htt171 and induced the formation of nuclear aggregates rather than the classical cytoplasmic aggregates; see Figure 4).

      Figure 4 from Chongtham et al. __(31)__. The influence of peptide modifications on HTT171 fragment behavior. (A) When expressed ubiquitously with da>Gal4, the HTT171-120Q fragment exhibits little or no lethality, but appending either an HA (...YPYDVPDYA) or a LUM tag (...GCCPGCCGG) to the C-terminus dramatically increases the toxicity of the fragment. (B) Surviving adult flies expressing an HA-tagged HTT171 transgene exhibit about half the life span of those expressing untagged 171. Flies expressing LUM-tagged HTT171 do not survive to adulthood. (C) Flies expressing HA- or LUM-tagged 171 in tracheal cells show only modest increases in lethality that do not rise to the level of significance (P=0.12; 0.09), but the inclusion of tags changes the subcellular behavior significantly. (D) In contrast, in the prothoracic gland, expression of LUM-tagged 171 shows a significant increase in toxicity compared to 171 alone, while the HA-tagged 171 borders on significance (P=0.051). (E) In trachea, pure 171 forms cytoplasmic aggregates, while the inclusion of HA causes some HTT to become nuclear diffuse, and inclusion of the LUM tag causes the bulk of the HTT to appear as diffuse nuclear material with some cytoplasmic aggregates remaining when expressed with btl>Gal4 at 29 °C. (F) In the prothoracic gland, addition of the LUM tag causes aggregated-cytoplasmic HTT to become weakly staining diffuse-cytoplasmic material while HTT171HA remains as extensive aggregates in the cytoplasm with a haze of diffuse staining as well. Scale bars are 10 μm.

      Therefore, in this study, we aimed to investigate for the first time the role of the sequence and the conformational properties of the Nt17 domain in regulating the dynamics of Httex1 fibrillization and the structure and morphology of Httex1 fibrils using tag-free Httex1 constructs. In our studies, we used multiple methods to examine the structural and cellular properties of these proteins under the same conditions and in the same cellular systems, thus making it possible to correlate the sequence, structural and cellular properties of the different Httex1 proteins (monomers and fibrils). We are happy to see that this was nicely recognized and appreciated by the referees.

      Referee #1 “The authors use some outstanding methods to ensure that the conclusions are based on good quality data. Overall, this is an excellent study.” Referee #3 “Their findings provided the precise information for the role of tag-free Nt17. The paper advanced our knowledge of Nt17, especially in the Huntington disease field.”

      Major comments

      Referee #2 raised the following concerns:

      1) « The main hypothesis of this study solely depends on the ability of N17 domain to enhance aggregation (Fig 1 and Fig 2). According to the method for the protein solubilization 1mM TCEP was added to ∆Htt-Ex1, but not to Htt-Ex1 proteins. It is necessary to rule out the potential effects of TCEP on aggregation assay. »

      We thank the referee for raising this important point. Indeed, we were also concerned about the potential effect of TCEP and conducted experiments to address this point. Our data show that TCEP does not affect our aggregation assay. This new panel is now included in the supporting information as Figure S2A-B and mentioned in the corresponding section of the Materials and Methods (page 29).

      2) « The author needs to provide biophysical data of the mutation and phosphorylated proteins with/without Tag. »

      All the proteins used in this study have been extensively characterized in recent publications from our lab __(9, 11, 21)__. All these papers are cited throughout our manuscript as well as in the material and method section.

      The expression, purification and characterization of native tag-free Httex1 with polyQ repeats ranging from 7 to 49Q has been fully described in Vieweg et al., 2016 __(11)__. In this paper, the aggregation properties of tag-free Httex1 and Httex1 fused to GST or MBP tags were compared by sedimentation assay, while the morphology and length of the resulting fibrils were compared by EM.

      The semisynthesis, purification, and characterization of Httex1 42Q phosphorylated at Ser-13 and/or Ser-16 or at T3 were described in Deguire et al., 2018 (9) and Chiki et al., 2017 (8), respectively. These studies include the kinetics of aggregation and a morphological assessment (i.e., heights and lengths) by EM and AFM of the fibrils formed by phosphorylated or unphosphorylated mutant Httex1.

      The Httex1 mutants carrying the GFP tag were not used in the in vitro studies but were studied in our overexpression-based cellular model. The direct comparison and characterization of inclusion formation by tag-free and GFP-tagged mutant Httex1 and their impact on cellular homeostasis are fully described in a preprint paper available in BioRxiv https://www.biorxiv.org/content/10.1101/2020.07.29.226977v1 (21). In this paper, we show that inclusions produced by mutant Httex1 72Q-GFP exhibit striking differences in terms of organization, ultrastructural properties, composition, and their impact on mitochondrial functions as compared to the inclusions formed by the tag-free mutant Httex1 72Q. These findings highlight the critical importance of developing new tools that minimize the impact of large fluorescent proteins, and/or label-free imaging methods, for monitoring Htt aggregation and inclusion formation in cells.

      Referee #3

      General comments

      “Their findings provided the precise information for the role of tag-free Nt17. __The paper advanced our knowledge of Nt17, especially in the Huntington disease field.__”

      We thank referee #3 for the very positive feedback and for recognizing the quality, depth and significance of our work and its potential impact in the field of Huntington's disease.

      “However, the conceptual advance is limited.”

      We respectfully disagree with this assessment that the conceptual advance of our study is limited.

      Please see our detailed response to Referee #2 regarding our work's originality and novelty (pages 3-8, in our referees' letter).

      Major comments

      Referee #3 raised the following concerns:

      1) Finding of lateral association (bundling) of ΔNt17-Httex1 fibrils is interesting.

      We agree and thank the referee for further highlighting this point.

      However, pathological significance is not clear

      We agree that the significance of ΔNt17 is not clear, as we do not know whether this cleavage occurs in vivo or not. This is why we decided to extend our studies beyond the removal of Nt17 and investigated the effect of natural PTMs that are known to alter the sequence of Nt17 and modulate its helicity. One additional distinguishing feature of our work is that we used proteins (monomers and fibrils) that bear site-specific bona fide phosphorylation on T3, S13, S16, and both S13/S16.

      a) Does even the non-truncated form also increase this kind of bundling when polyQ is expanded?

      We have addressed this specific question in a previous study (11), in which we compared the morphology and length of fibrils formed from Httex1 with polyQ tracts from 23Q to 43Q. The increased lateral association was not observed for the fibrils generated from Httex1 43Q or Httex1 23Q, 29Q, or 37Q (Figure 5F) (11). In addition, in that paper, we were the first to show an inverse correlation between the polyQ length and fibril length, which suggests structural differences between Httex1 proteins with different polyQ repeat lengths. Others have investigated Httex1 with different polyQ repeat lengths, but not using tag-free Httex1 proteins, and they did not observe this inverse relationship between polyQ length and fibril length, as we did here and in our previous studies (11).

      b) When fibrils are added to striatal neurons like in Fig.5, is this structural feature preserved on the membrane or inside of the cells?

      We agree with the referee that this is an important point to address. However, deciphering the structural properties of the membrane-bound and internalized fibrils is not trivial, especially given the limited amount of unlabelled fibrils that are taken up by the cells. This would require extensive optimization of the CLEM technique or the use of an alternative approach such as tomography. Due to the resources and time required to address this important question, this part of the project will be included in future projects aimed at investigating the mechanisms of Htt seeding and propagation. We are not aware of any reports by other groups that monitor the structural changes of exogenous fibrils after internalization into cells.

      c) When Httex1 fibrils species are expressed, is this bundling also observed?

      In fact, we have recently completed a comprehensive comparison of the inclusions formed by mutant Httex1 72Q and ΔNt17-Httex1.

      In this study, we have shown that Httex1 72Q and the truncated form ΔNt17-Httex1 72Q form cytosolic inclusions of similar size and shape in HEK 293 cells (Figure 4). We have further characterized the architecture and organization of these inclusions at the ultrastructural level in the context of another project. Our findings are now available online (see Riguet et al., 2021, BioRxiv (21)).

      Using correlative light electron microscopy, we showed that the inclusions formed by Httex1, whether full length or lacking the Nt17 domain, exhibited similar architecture and a ring-like organization. Interestingly, we showed that the inclusions are composed of a highly organized fibrillar network at the core and periphery of the inclusions. In cells, inclusion formation is a multiphasic process driven by different phases of polyQ-dependent aggregation processes and complex interactions with lipids, proteins and organelles (ER).

      Although the CLEM approach in neurons provides very good contrast of cytosolic or nuclear inclusions, the resolution of this method is not sufficient to allow imaging at the level of individual fibrils and assessment of their morphology. The difference between CLEM and EM resolution can be explained by the fact that the slices of the cellular objects are much thicker (~50 nm) than the fibrils prepared in vitro and directly deposited on the EM grids (the height of Httex1 pre-formed fibrils is between 5 and 7 nm). To improve the imaging and obtain stronger contrast of de novo fibrils in our CLEM samples, we used a double-contrast method based on uranyl acetate and lead citrate stains. Nevertheless, the complex cellular environment and the presence of various cellular objects (e.g., organelles and proteins) surrounding the de novo fibrils might prevent optimal stain penetration and thus imaging at the level of individual fibrils. Finally, the preparation of the neuronal samples for CLEM imaging includes incubation with ethanol and detergents and resin embedding. These steps can limit the ultrastructural detection of the de novo fibrils at the level of individual fibrils and therefore do not allow us to determine their organization and lateral association.

      1. d) What function (cell death, membrane integrity or others) is most correlated with this structural feature?

      In our extracellular model, we have shown that the conformation and sequence properties of the Nt17 domain are key determinants of the internalization and the subcellular localization of Httex1 fibrils in primary striatal neurons. Httex1 43Q fibrils mostly accumulated at the outer side of the neuronal plasma membrane, Httex1-ΔNt17 43Q fibrils were detected primarily in the nucleus, and the M8P-Httex1 43Q fibrils were equally distributed between the cytosol and the nucleus.

      Despite exhibiting completely different subcellular distribution and internalization levels, the 3 types of fibrils induced neuronal cell death with the highest toxicity observed for ∆Nt17-Httex1 (Figure 7).

      Our data suggest that the neurotoxic response is primarily dependent on the subcellular localization of the Httex1 species: 1) accumulation of the Httex1 43Q fibrils on the plasma membrane is likely to induce loss of membrane integrity, based on previous observations with aSyn fibrils; 2) the nuclear accumulation of ∆Nt17-Httex1 aggregates has been previously shown to be highly toxic in several cellular and animal models (10, 67, 68).

      Nevertheless, we cannot rule out that the high toxicity of ΔNt17-Httex1 fibrils could also be due to their distinct biophysical and structural properties. ΔNt17-Httex1 forms broad fibrils characterized by lateral association, which could provide a surface for the sequestration of intracellular proteins.

      2) The authors claimed, “we investigated for the first time, the role of the Nt17 sequence, PTMs and conformation in regulating the internalization and cell-to-cell propagation of monomeric and fibrillar forms of mutant Httex1”. However, as far as this reviewer understands, the authors studied the internalization but not cell-to-cell propagation.

      We agree and apologize for this mistake, as we indeed limited our study to the uptake, subcellular localization, and toxicity of extracellular Httex1 species in our primary neuronal model. The text has been amended, and cell-to-cell propagation has been removed from the abstract as well as from pages 2, 4, 12, and 17.

      Minor points for Referees #1, #2 and #3

      Referee #1

      1. « On page 6, the data on how the Nt17 domain affects Httex1 aggregation, the information on which figure it is referring to is missing. »

      Done. The information regarding the figure related to these data has now been added on page 6.

      2. « In Figure 1A, it is difficult to compare the data on Nt17 and ΔNt17, particularly for 36Q and 42Q, as the time axes are different. I understand that the kinetics are different, but particularly for the 42Q peptides (Nt17 and ΔNt17) as their kinetics are not that different, it may be useful to show them in the same panel. »

      Done. A new panel combining the data on Nt17 and ΔNt17 has now been added as Figure S1B.

      Referee #2

      1) « Fig 8 the color codes for PolyQ and PolyP need to be corrected. »

      Done.

      2) « It is a challenging technical problem to produce proteins which are rich in Pro and Gln content. But there are not enough experimental details provided in the methods. Please add detailed procedures for expression and purification of these proteins. »

      We thank referee #2 for recognizing the technical challenges of expressing and producing Httex1 proteins and mutants. The expression, purification and characterization methods for all the proteins used in this manuscript have been extensively detailed in our previous studies (8, 9, 11, 69-71). We have now added the relevant references in the methods section (page 29).

      Referee #3

      1) « Fig.3B arrowhead could not be seen. »

      Done. Arrowheads are now added to Fig. 3B.

      2) « Fig.4A: what do arrows mean? No scale bars? »

      The arrows indicate the aggregates formed in HEK cells overexpressing Httex1 39Q and 72Q. This is now stated in the legend of Figure 4.

      The scale bars are already present in both the main and the inset images.

      3) « Fig.5A:no scale bars? »

      Done. Scale bars were added in the 4 images where they were missing.

      4) « Fig.S3. Height and length seem to be wrong. »

      The measurements of height and length were performed as described in the literature (72) and are consistent with previous studies (8, 9, 11).

      5) « Fig.S6C: hard to compare. D: What is Htt2-90? Also in Fig.S13. »

      We thank the referee for bringing this to our attention and apologize for the lack of consistency in the names used for the proteins studied in Figures S6 and S13. We realized that in these figures some protein names were mislabelled because of a misplaced dash, and that the same proteins were named in different ways. We agree that this makes it difficult to compare the data between the different panels. We have now corrected our mistakes, and Figures S6B and S13 have been updated accordingly.

      The name Htt2-90 corresponds to Httex1 expressed from amino acid 2 to amino acid 90, with the first N-terminal methionine removed.

      6) « There are many abbreviations difficult to understand in supplement. Fig.S1 Htt18-90(Q18C) etc. »

      His6-Intein Ssp stands for the Ssp intein tagged with six histidine residues.

      Htt18-90(Q18C) means Httex1 expressed from amino acid 18 to amino acid 90, with the glutamine (Q) at position 18 mutated to cysteine (C).

      Htt2-17 means Httex1 synthesized from amino acid 2 to amino acid 17, with the first N-terminal methionine removed.

      Htt18-90(Q18A) corresponds to Httex1 expressed from amino acid 18 to amino acid 90, with the glutamine (Q) at position 18 mutated to alanine (A).

      References

      1. M. E. MacDonald, S. Gines, J. F. Gusella, V. C. Wheeler, Huntington’s Disease. NeuroMolecular Medicine 4, 7-20 (2003).
      2. S. E. Andrew et al., The relationship between trinucleotide (CAG) repeat length and clinical features of Huntington's disease. Nature genetics 4, 398-403 (1993).
      3. J. F. Gusella, M. E. MacDonald, Huntington's disease: the case for genetic modifiers. Genome Med 1, 80 (2009).
      4. J. M. Andresen et al., The relationship between CAG repeat length and age of onset differs for Huntington's disease patients with juvenile onset or adult onset. Ann Hum Genet 71, 295-301 (2007).
      5. A. S. Wagner et al., Self-assembly of Mutant Huntingtin Exon-1 Fragments into Large Complex Fibrillar Structures Involves Nucleated Branching. Journal of molecular biology 430, 1725-1744 (2018).
      6. Y. Sun, A. Savanenin, P. H. Reddy, Y. F. Liu, Polyglutamine-expanded Huntingtin Promotes Sensitization of N-Methyl-D-aspartate Receptors via Post-synaptic Density 95. Journal of Biological Chemistry 276, 24713-24718 (2001).
      7. A. Ansaloni et al., One-pot semisynthesis of exon 1 of the Huntingtin protein: new tools for elucidating the role of posttranslational modifications in the pathogenesis of Huntington's disease. Angewandte Chemie (International ed. in English) 53, 1928-1933 (2014).
      8. A. Chiki et al., Mutant Exon1 Huntingtin Aggregation is Regulated by T3 Phosphorylation-Induced Structural Changes and Crosstalk between T3 Phosphorylation and Acetylation at K6. Angewandte Chemie International Edition 10.1002/anie.201611750, 1-6 (2017).
      9. S. M. DeGuire et al., N-terminal Huntingtin (Htt) phosphorylation is a molecular switch regulating Htt aggregation, helical conformation, internalization, and nuclear targeting. Journal of Biological Chemistry 293, 18540-18558 (2018).
      10. K. Shen et al., Control of the structural landscape and neuronal proteotoxicity of mutant Huntingtin by domains flanking the polyQ tract. eLife 5, 1-29 (2016).
      11. S. Vieweg, A. Ansaloni, Z. M. Wang, J. B. Warner, H. A. Lashuel, An intein-based strategy for the production of tag-free huntingtin exon 1 proteins enables new insights into the polyglutamine dependence of Httex1 aggregation and fibril formation. Journal of Biological Chemistry 291, 12074-12086 (2016).
      12. J. Legleiter et al., Mutant huntingtin fragments form oligomers in a polyglutamine length-dependent manner in Vitro and in Vivo. Journal of Biological Chemistry 285, 14777-14790 (2010).
      13. B. Sahoo, D. Singer, R. Kodali, T. Zuchner, R. Wetzel, Aggregation behavior of chemically synthesized, full-length huntingtin exon1. Biochemistry 53, 3897-3907 (2014).
      14. J. C. Boatz et al., Protofilament structure and supramolecular polymorphism of aggregated mutant huntingtin exon 1. Journal of molecular biology 10.1016/j.matdes.2019.108334 (2020).
      15. F. J. B. Bäuerlein et al., In Situ Architecture and Cellular Interactions of PolyQ Inclusions. Cell 171, 179-187.e110 (2017).
      16. A. Iwata, B. E. Riley, J. A. Johnston, R. R. Kopito, HDAC6 and microtubules are required for autophagic degradation of aggregated Huntingtin. Journal of Biological Chemistry 280, 40282-40292 (2005).
      17. M. Kim et al., Mutant Huntingtin Expression in Clonal Striatal Cells: Dissociation of Inclusion Formation and Neuronal Survival by Caspase Inhibition. The Journal of Neuroscience 19, 964-973 (1999).
      18. Y. E. Kim et al., Soluble Oligomers of PolyQ-Expanded Huntingtin Target a Multiplicity of Key Cellular Factors. Molecular cell 63, 950-964 (2016).
      19. H. Luo et al., Herp Promotes Degradation of Mutant Huntingtin: Involvement of the Proteasome and Molecular Chaperones. Molecular neurobiology 10.1007/s12035-018-0900-8 (2018).
      20. T. R. Peskett et al., A Liquid to Solid Phase Transition Underlying Pathological Huntingtin Exon1 Aggregation. Molecular cell 10.1016/j.molcel.2018.04.007, 1-14 (2018).
      21. N. Riguet et al., Disentangling the sequence, cellular and ultrastructural determinants of Huntingtin nuclear and cytoplasmic inclusion formation. bioRxiv 10.1101/2020.07.29.226977 (2020).
      22. K. Tagawa et al., Distinct aggregation and cell death patterns among different types of primary neurons induced by mutant huntingtin protein. Journal of Neurochemistry 89, 974-987 (2004).
      23. Z. Zheng, A. Li, B. B. Holmes, J. C. Marasa, M. I. Diamond, An N-terminal nuclear export signal regulates trafficking and aggregation of huntingtin (Htt) protein exon 1. Journal of Biological Chemistry 288, 6063-6071 (2013).
      24. S. W. Davies et al., Formation of Neuronal Intranuclear Inclusions Underlies the Neurological Dysfunction in Mice Transgenic for the HD Mutation. Cell 90, 537-548 (1997).
      25. L. Mangiarini et al., Exon I of the HD gene with an expanded CAG repeat is sufficient to cause a progressive neurological phenotype in transgenic mice. Cell 87, 493-506 (1996).
      26. D. Martindale et al., Length of huntingtin and its polyglutamine tract influences localization and frequency of intracellular aggregates. Nature genetics 18, 150-154 (1998).
      27. J. Ochaba et al., PIAS1 Regulates Mutant Huntingtin Accumulation and Huntington's Disease-Associated Phenotypes In Vivo. Neuron 90, 507-520 (2016).
      28. G. Schilling et al., Intranuclear inclusions and neuritic aggregates in transgenic mice expressing a mutant N-terminal fragment of huntingtin. Human Molecular Genetics 8, 397-407 (1999).
      29. G. Matsumoto, S. Kim, R. I. Morimoto, Huntingtin and Mutant SOD1 Form Aggregate Structures with Distinct Molecular Properties in Human Cells. The Journal of biological chemistry 281, 4477-4485 (2006).
      30. S. Waelter et al., Accumulation of mutant huntingtin fragments in aggresome-like inclusion bodies as a result of insufficient protein degradation. Molecular biology of the cell 12, 1393-1407 (2001).
      31. A. Chongtham et al., Effects of flanking sequences and cellular context on subcellular behavior and pathology of mutant HTT. Human molecular genetics 29, 674-688 (2020).
      32. T. Maiuri, T. Woloshansky, J. Xia, R. Truant, The huntingtin N17 domain is a multifunctional CRM1 and ran-dependent nuclear and cilial export signal. Human Molecular Genetics 22, 1383-1394 (2013).
      33. C. Landles et al., Subcellular Localization and Formation of Huntingtin Aggregates Correlates with Symptom Onset and Progression in a Huntington’s Disease Model. Brain Communications 2 (2020).
      34. C. Cariulo et al., Phosphorylation of huntingtin at residue T3 is decreased in Huntington’s disease and modulates mutant huntingtin protein conformation. Proceedings of the National Academy of Sciences 10.1073/pnas.1705372114, 201705372-201705372 (2017).
      35. R. N. Hegde et al., TBK1 phosphorylates mutant Huntingtin and suppresses its aggregation and toxicity in Huntington's disease models. The EMBO journal 10.15252/embj.2020104671, e104671 (2020).
      36. L. M. Thompson et al., IKK phosphorylates Huntingtin and targets it for degradation by the proteasome and lysosome. Journal of Cell Biology 187, 1083-1099 (2009).
      37. R. S. Atwal et al., Kinase inhibitors modulate huntingtin cell localization and toxicity. Nature chemical biology 7, 453-460 (2011).
      38. A. Bhattacharyya et al., Oligoproline effects on polyglutamine conformation and aggregation. Journal of molecular biology 355, 524-535 (2006).
      39. S. Chen, V. Berthelier, W. Yang, R. Wetzel, Polyglutamine aggregation behavior in vitro supports a recruitment mechanism of cytotoxicity. Journal of molecular biology 311, 173-182 (2001).
      40. S. L. Crick, K. M. Ruff, K. Garai, C. Frieden, R. V. Pappu, Unmasking the roles of N- and C-terminal flanking sequences from exon 1 of huntingtin as modulators of polyglutamine aggregation. Proceedings of the National Academy of Sciences 110, 20075-20080 (2013).
      41. K. Kar, M. Jayaraman, B. Sahoo, R. Kodali, R. Wetzel, Critical nucleus size for disease-related polyglutamine aggregation is repeat-length dependent. Nature Structural and Molecular Biology 18, 328-336 (2011).
      42. R. Mishra et al., Serine phosphorylation suppresses huntingtin amyloid accumulation by altering protein aggregation properties. Journal of molecular biology 424, 1-14 (2012).
      43. A. K. Thakur et al., Polyglutamine disruption of the huntingtin exon1 N-terminus triggers a complex aggregation mechanism. Nat Struct Mol Biol 16, 380-389 (2009).
      44. D. Bulone, L. Masino, D. J. Thomas, P. L. San Biagio, A. Pastore, The interplay between PolyQ and protein context delays aggregation by forming a reservoir of protofibrils. PloS one 1, e111 (2006).
      45. L. Masino, G. Kelly, K. Leonard, Y. Trottier, A. Pastore, Solution structure of polyglutamine tracts in GST-polyglutamine fusion proteins. FEBS Letters 513, 267-272 (2002).
      46. Y. Nagai et al., A toxic monomeric conformer of the polyglutamine protein. Nature Structural and Molecular Biology 14, 332-340 (2007).
      47. E. J. Bennett, N. F. Bence, R. Jayakumar, R. R. Kopito, Global Impairment of the Ubiquitin-Proteasome System by Nuclear or Cytoplasmic Protein Aggregates Precedes Inclusion Body Formation. Molecular cell 17, 351-365 (2005).
      48. C. W. Bugg, J. M. Isas, T. Fischer, P. H. Patterson, R. Langen, Structural features and domain organization of huntingtin fibrils. Journal of Biological Chemistry 287, 31739-31746 (2012).
      49. P. R. Dahlgren et al., Atomic force microscopy analysis of the Huntington protein nanofibril formation. Nanomedicine: Nanotechnology, Biology, and Medicine 1, 52-57 (2005).
      50. W. C. Duim, Y. Jiang, K. Shen, J. Frydman, W. E. Moerner, Super-resolution fluorescence of huntingtin reveals growth of globular species into short fibers and coexistence of distinct aggregates. ACS Chemical Biology 9, 2767-2778 (2014).
      51. J. M. Isas, R. Langen, A. B. Siemer, Solid-State Nuclear Magnetic Resonance on the Static and Dynamic Domains of Huntingtin Exon-1 Fibrils. Biochemistry 54, 3942-3949 (2015).
      52. J. Legleiter et al., Monoclonal antibodies recognize distinct conformational epitopes formed by polyglutamine in a mutant huntingtin fragment. Journal of Biological Chemistry 284, 21647-21658 (2009).
      53. E. Monsellier, V. Redeker, G. Ruiz-Arlandis, L. Bousset, R. Melki, Molecular interaction between the chaperone Hsc70 and the N-terminal flank of huntingtin exon 1 modulates aggregation. Journal of Biological Chemistry 290, 2560-2576 (2015).
      54. P. J. Muchowski et al., Hsp70 and Hsp40 chaperones can inhibit self-assembly of polyglutamine proteins into amyloid-like fibrils. Proceedings of the National Academy of Sciences 97, 7841-7846 (2000).
      55. Y. Nekooki-machida et al., Distinct conformations of in vitro and in vivo amyloids of huntingtin-exon1 show different cytotoxicity. Proceedings of the National Academy of Sciences 106, 9679-9684 (2009).
      56. L. G. Nucifora et al., Identification of novel potentially toxic oligomers formed in vitro from mammalian-derived expanded huntingtin exon-1 protein. Journal of Biological Chemistry 287, 16017-16028 (2012).
      57. L. Pieri, K. Madiona, L. Bousset, R. Melki, Fibrillar α-synuclein and huntingtin exon 1 assemblies are toxic to the cells. Biophysical Journal 102, 2894-2905 (2012).
      58. M. A. Poirier et al., Huntingtin spheroids and protofibrils as precursors in polyglutamine fibrilization. Journal of Biological Chemistry 277, 41032-41037 (2002).
      59. E. Scherzinger et al., Huntingtin encoded polyglutamine expansions form amyloid-like protein aggregates in vitro and in vivo. Cell 90, 549-558 (1997).
      60. E. Scherzinger et al., Self-assembly of polyglutamine-containing huntingtin fragments into amyloid-like fibrils: implications for Huntington's disease pathology. Proceedings of the National Academy of Sciences of the United States of America 96, 4604-4609 (1999).
      61. J. L. Wacker, M. H. Zareie, H. Fong, M. Sarikaya, P. J. Muchowski, Hsp70 and Hsp40 attenuate formation of spherical and annular polyglutamine oligomers by partitioning monomer. Nature Structural and Molecular Biology 11, 1215-1222 (2004).
      62. M. Jayaraman et al., Kinetically competing huntingtin aggregation pathways control amyloid polymorphism and properties. Biochemistry 51, 2706-2716 (2012).
      63. A. K. Thakur, W. Yang, R. Wetzel, Inhibition of polyglutamine aggregate cytotoxicity by a structure-based elongation inhibitor. FASEB J 18, 923-925 (2004).
      64. D. W. Colby et al., Potent inhibition of huntingtin aggregation and cytotoxicity by a disulfide bond-free single-domain intracellular antibody. Proceedings of the National Academy of Sciences of the United States of America 101, 17616-17621 (2004).
      65. S. Tam, R. Geller, C. Spiess, J. Frydman, The chaperonin TRiC controls polyglutamine aggregation and toxicity through subunit-specific interactions. Nature Cell Biology 8, 1155-1162 (2006).
      66. R. S. Atwal et al., Huntingtin has a membrane association signal that can modulate huntingtin aggregation, nuclear entry and toxicity. Human Molecular Genetics 16, 2600-2615 (2007).
      67. X. Gu et al., N17 Modifies Mutant Huntingtin Nuclear Pathogenesis and Severity of Disease in HD BAC Transgenic Mice. Neuron 85, 726-741 (2015).
      68. M. B. Veldman et al., The N17 domain mitigates nuclear toxicity in a novel zebrafish Huntington's disease model. Molecular neurodegeneration 10, 67 (2015).
      69. A. Chiki et al., One-pot Semisynthesis of Exon1 of the Mutant Huntingtin Protein: An Important Advance Towards Elucidating the Molecular and Structural Determinants of Huntingtin's. Health and Biomedical, 1047-1047 (2014).
      70. A. Chiki et al., Site-specific phosphorylation of Huntingtin exon 1 recombinant proteins enabled by the discovery of novel kinases. Chembiochem : a European journal of chemical biology 10.1002/cbic.202000508 (2020).
      71. A. Reif, A. Chiki, J. Ricci, H. A. Lashuel, Generation of native, untagged huntingtin Exon1 monomer and fibrils using a SUMO fusion strategy. Journal of Visualized Experiments 2018, 1-9 (2018).
      72. F. S. Ruggeri, T. Šneideris, M. Vendruscolo, T. P. J. Knowles, Atomic force microscopy for single molecule characterisation of protein aggregation. Arch Biochem Biophys 664, 134-148 (2019).
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors investigated the structural features of the N-terminal 17 amino acids (Nt17) of Huntingtin, the gene product associated with Huntington's disease. Nt17 was reported to play roles in modulating Huntingtin's aggregation, its life cycle, membrane binding and toxicity; however, those reports used tagged Nt17. Because the tags could potentially influence the aggregation process and other properties, the authors used tag-free Nt17-huntingtin exon 1 (Httex1) protein. Using Nt17-deleted Httex1, mutants that disrupt the helical conformation such as M8P, and phosphorylated Nt17, they found that the Nt17 sequence, but not its helical conformation, determined the morphology and growth of Httex1 fibrils in vitro. In cells, the Nt17 sequence and its helical conformation influenced aggregation propensity and toxic properties. Furthermore, the uptake of Httex1 into primary striatal neurons is influenced by the helical propensity of Nt17. They concluded that the Nt17 domain serves as the master regulator of Htt aggregation and toxicity. Their findings provide precise information on the role of tag-free Nt17.

      Major concerns:

      1) The finding of lateral association (bundling) of ΔNt17-Httex1 fibrils is interesting. However, its pathological significance is not clear. a) Does the non-truncated form also show increased bundling when polyQ is expanded? b) When fibrils are added to striatal neurons as in Fig. 5, is this structural feature preserved on the membrane or inside the cells? c) When Httex1 fibril species are expressed, is this bundling also observed? d) What function (cell death, membrane integrity or others) is most correlated with this structural feature?

      2) The authors claimed, “we investigated for the first time, the role of the Nt17 sequence, PTMs and conformation in regulating the internalization and cell-to-cell propagation of monomeric and fibrillar forms of mutant Httex1.” However, as far as this reviewer understands, the authors studied the internalization but not cell-to-cell propagation.

      Minor points

      1) Fig.3B arrowhead could not be seen.

      2) Fig.4A: what do arrows mean? The insets are hard to identify. No scale bars?

      3) Fig.5A: no scale bars?

      4) Fig.S3. Height and length seem to be wrong.

      5) Fig.S6C: hard to compare. D: What is Htt2-90? Also in Fig.S13.

      6) There are many abbreviations difficult to understand in supplement. Fig.S1 Htt18-90(Q18C) etc.

      Significance

      The paper advanced our knowledge of Nt17, especially in the Huntington disease field. However, the conceptual advance is limited.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This article describes the results from studies into mechanisms of the aggregation and toxicity of Htt Exon1 protein. The authors investigated the role of N17, polyQ length, M9C mutation, and phosphorylation. Multiple approaches were used, including biochemical protein design, biophysical measurements, and cell biological experiments with cultured mammalian cells. The authors demonstrate effects of protein context on aggregation. Furthermore, the authors were able to visualize the aggregates in mammalian cells and in neurons using multiple methods. These are interesting data, but there are several major weaknesses in the study. The first problem is that most of the results related to aggregation mechanisms and toxicity are not original and are incremental when compared to many previously published articles. Moreover, there are several problems in the interpretation of the data obtained and in the conclusions drawn. Some of the most critical problems are listed below.

      • The main hypothesis of this study solely depends on the ability of the N17 domain to enhance aggregation (Fig 1 and Fig 2). According to the protein solubilization method, 1 mM TCEP was added to ∆Htt-Ex1, but not to Htt-Ex1 proteins. It is necessary to rule out potential effects of TCEP on the aggregation assay.
      • The authors need to provide biophysical data for the mutant and phosphorylated proteins with and without Tag. As stated by the authors, even a slight change in protein context could lead to unexpected changes in the structural behavior of a protein. Thus, the importance of the Tag needs to be evaluated.
      • It is a challenging technical problem to produce proteins which are rich in Pro and Gln content. But there are not enough experimental details provided in the methods. Please add detailed procedures for expression and purification of these proteins.
      • Fig 8: the color codes for PolyQ and PolyP need to be corrected.

      Significance

      This article describes the results from studies into mechanisms of the aggregation and toxicity of Htt Exon1 protein. The authors investigated the role of N17, polyQ length, M9C mutation, and phosphorylation. Multiple approaches were used, including biochemical protein design, biophysical measurements, and cell biological experiments with cultured mammalian cells. The authors demonstrate effects of protein context on aggregation. Furthermore, the authors were able to visualize the aggregates in mammalian cells and in neurons using multiple methods. These are interesting data, but there are several major weaknesses in the study. The first problem is that most of the results related to aggregation mechanisms and toxicity are not original and are incremental when compared to many previously published articles. Moreover, there are several problems in the interpretation of the data obtained and in the conclusions drawn. Some of the most critical problems are listed below.

      • The main hypothesis of this study solely depends on the ability of the N17 domain to enhance aggregation (Fig 1 and Fig 2). According to the protein solubilization method, 1 mM TCEP was added to ∆Htt-Ex1, but not to Htt-Ex1 proteins. It is necessary to rule out potential effects of TCEP on the aggregation assay.
      • The authors need to provide biophysical data for the mutant and phosphorylated proteins with and without Tag. As stated by the authors, even a slight change in protein context could lead to unexpected changes in the structural behavior of a protein. Thus, the importance of the Tag needs to be evaluated.
      • It is a challenging technical problem to produce proteins which are rich in Pro and Gln content. But there are not enough experimental details provided in the methods. Please add detailed procedures for expression and purification of these proteins.
      • Fig 8: the color codes for PolyQ and PolyP need to be corrected.
    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Vieweg, Mahul-Mellier, Ruggeri et al. describes the role of the sequence and conformation of the extreme N-terminus of the Huntingtin protein in terms of aggregation and toxicity, together with its relation to the polyglutamine length. The authors use some outstanding methods to ensure that the conclusions are based on good quality data. The manuscript is generally well written, although it might benefit from reducing the length of the discussion section.

      Major points to address are:

      1. For their in vitro data, the authors do not go beyond 42 polyglutamines. Is there any particular reason for that? The authors see a clear difference between 36Q and 42Q, but although not critical, it would have been useful to use longer repeats. In my view, the authors should at least discuss the rationale for this, particularly as in cellular models they do use 72Q constructs.
      2. The role of the N-terminal 17 amino acids of huntingtin (Nt17) is addressed by comparing peptides with and without the Nt17 and their relation to the adjacent polyglutamine tract. Using this approach, the peptide without the Nt17 is composed of pure polyglutamines in its N-terminus, followed by the rest of exon 1 in its C-terminus. This is clearly the key comparison to address the role of the Nt17 in the context of an exon1 containing polyQ. However, did the authors consider using other synthetic sequences at the Nt17 to further address the role of the N-terminal tail in the aggregation potential? Although this might not be critical, it could be a useful control to add. Admittedly, by using a mutant peptide (M8P) and phosphorylated forms, the authors are already addressing the issue of sequence and/or conformation.

      Minor points:

      1. On page 6, the data on how the Nt17 domain affects Httex1 aggregation, the information on which figure it is referring to is missing.
      2. In Figure 1A, it is difficult to compare the data on Nt17 and ΔNt17, particularly for 36Q and 42Q, as the time axes are different. I understand that the kinetics are different, but particularly for the 42Q peptides (Nt17 and ΔNt17) as their kinetics are not that different, it may be useful to show them in the same panel.

      Significance

      Overall, this is an excellent study. My expertise is in genetics and molecular and cellular biology, and I have worked in HD research for more than 10 years. However, I am not a chemist, and therefore cannot comment on any possible limitations of some of the techniques involved.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The authors do not wish to provide a response at this time.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary:

      The freshwater polyp Hydra possesses the remarkable ability to regenerate a fully functional head within a few days after amputation; however, when, e.g., Notch signaling is inhibited, the animals fail to regenerate the original head pattern. In the manuscript by Moneer et al., the authors aim to identify Notch-responsive genes by RNA sequencing. 48 hours after Notch signaling inhibition with DAPT, 624 genes were upregulated and 207 genes downregulated. To identify putative direct Notch target genes, the authors generated RNA-seq datasets at 3 and 6 hours after DAPT removal and propose that the expression of direct target genes is rapidly recovered within 3 hours, as shown by the re-expression of HyHES. Furthermore, by performing motif enrichment analyses, the authors propose that, e.g., HyAlx and HySp5 could be direct Notch target genes.

      Major comments:

      1) It is not clear why the authors chose 48 hours as a time point for RNA sequencing. Why not 12 or 24 hours after DAPT exposure? Is the expression of HyHES or CnASH not downregulated at earlier time points? Furthermore, why did the authors use whole animals and not just the head tissue for RNA-seq to enrich the transcripts?

      2) Why did the authors not perform RNA sequencing on head regenerating DAPT-treated animals? This would help to better understand the relationship between Notch and Wnt signaling especially as the authors showed in 2013 (Mündner et al) that the expression of Wnt3 is strongly affected in head regenerating DAPT-treated animals.

      3) It is currently very difficult to fully evaluate the data. One single excel file with all up- and downregulated candidates should be provided (Trinity ID, fold change, False Discovery Rate, annotation etc.). I would have assumed that genes such as Wnt8 that are expressed at the base of the tentacles (Philipp et al., 2009) could be affected by DAPT. Is Wnt3 not affected at all in intact animals?

      4) The silencing of Sp5 induces the formation of ectopic heads in intact and regenerating conditions and it has clearly been shown that Sp5 inhibits Wnt/β-catenin signaling. To call Sp5 a tentacle patterning gene just based on the identification of RBPJ-motifs in the Sp5 regulatory region is misleading, as it is currently not supported by experimental data. The fact that a regulatory motif is present in a promoter region does not mean that this regulatory motif is active.

      5) This manuscript would be much more interesting and of greater importance if the authors would have added functional data for one or two candidate genes.

      Minor comments:

      1) Figure S1: Individual data points for the qPCR analysis should be shown and error bars added.

      2) Figure 6: Scale bars are missing.

      Significance

      The manuscript is well written, and the presented results could be of interest for the Hydra field but they will not have a broad impact in the present state. I find it unfortunate that the authors did not use the datasets produced to better understand the complex regulatory network that is active during the patterning of the Hydra head.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Moneer et al. studied Notch target genes in the context of nematogenesis, i.e. the generation of stinging cells (nematocytes) from interstitial stem cells (i-cells), and in axial patterning, in the cnidarian Hydra. They used the Notch pathway inhibitor DAPT, a drug acting on presenilin, preventing the release of Notch intracellular domain (NICD). Bottger's team pioneered the usage of DAPT in Hydra back in 2007 and it has been used successfully since then in other cnidarians too. The authors first exposed Hydra polyps to DAPT for 48 hours, followed by transcriptomic analysis to identify Notch responsive genes. They then analyzed gene expression at 3 and 6 hours after removal of DAPT to identify direct and indirect Notch targets, respectively. Using a recently published Hydra single-cell atlas, the authors report that most Notch responsive genes are expressed in the nematocyte lineage, consistent with the known role of Notch signaling in hydrozoan nematogenesis. They also identify Notch targets in epithelial cells, consistent with a role of the pathway in axis patterning.

      Overall, the manuscript is interesting, and the authors' conclusions are largely supported by the data. A strength of the paper is the good use it makes of a previously published Hydra single-cell transcriptome, analysed in collaboration with the Juliano lab, who generated this data set. A weakness of the work is the dependence on pharmacological inhibition of Notch and the absence of genetic interference; the latter would provide evidence for specificity, as opposed to phenotypes being a side effect of DAPT or high DMSO concentration (e.g. stress response, see specific point #6, below). The text reads well, and the figures are of good quality. Below is a list of points to be addressed.

      1) On p. 4, the authors state: "We identified 831 genes that were differentially expressed in response to 48 hours of DAPT treatments". This refers to genes differentially expressed at T0. Then, they check the expression of these genes at T3 and T6. Were all differentially expressed genes at T3 and T6 included in the 831 genes identified at T0? Did the authors find differentially expressed genes at T3 and T6 which are not differentially expressed at T0?

      2) p. 2, last paragraph: insert "the time points 3 and 6 hrs after DAPT removal" after "To characterize...". This is important to clarify that the analysis was done after removal rather than the addition of DAPT.

      3) The authors normalized the expression of genes of interest to several housekeeping genes (RPL13, SDH, EF1α, GAPDH, and Actin) in their qPCR analysis. In Fig. S1, however, only "control" is written. Did the authors merge all results from the different housekeeping genes, or did they use only one reference gene as control (which one?) to generate the figure?

      4) On Fig. 3 and the accompanying text on p 5, the black and grey clusters represent 90 and 80 genes, respectively. These 170 genes represent 25% of the total (170/666), not 20%. Clarify.

      5) The figure number of Figure S2 is not indicated in the figure.

      6) Can the authors confirm the DMSO concentration (1%)? I am aware this was the concentration used in their previous work, but it is nevertheless pretty high. High DMSO concentration could explain the stress response they observed.

      7) Figure 1: on the right, a few letters are missing.

      8) Fig. 5B, remove lettering J,K,L from lower panel images.

      9) Figure number is absent in Figure 9.

      10) The authors completely ignore work on Notch signaling in other cnidarians. This not only impedes an evolutionary synthesis of the data but also leads to failure to discuss other functions Notch fulfills in cnidarian biology (e.g. immunity and regeneration).

      Significance

      The Notch pathway inhibitor, DAPT, has been widely used in work involving cnidarians. These studies have established a role for Notch in late-stage stinging cell differentiation and in tentacle morphogenesis in development and regeneration (Layden and Martindale, 2014; Marlow et al., 2012; Munder et al. 2013; Richards and Rentzsch, 2014, 2015; Gahan et al., 2017). It has also been shown that early-stage neurogenesis in hydrozoans is independent of Notch (Kasbauer et al. 2007; Gahan et al. 2017), which is different from bilaterian and anthozoan neurogenesis. What Moneer et al. did in the present study was to take these known phenotypes and put them in a cellular and molecular context. The results, showing that nematogenesis genes are Notch targets, are not surprising but novel. This work closes an existing knowledge gap and is important for the field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the manuscript by Moneer et al., Notch target genes are defined in Hydra using a classical gamma-secretase inhibition approach. Gene expression analysis is done at different time points via RNAseq and combined with single-cell data and ATAC-seq data. This is further elaborated with exact expression analysis and experiments studying the wash-out (recovery) of the inhibitor, again followed by gene expression profiling. The study identifies several new and interesting target genes. The downstream transcription factors Pou4F3 and Pax6 are very interesting, and the Wnt-pathway regulators as well. This is far more convincing than the previously described cross-talks.

      My comments:

      1) Introduction (page 3): "Only few direct target genes of Notch-signaling have been identified so far." I don't agree. By now, there are several studies in the mammalian system using ChIPseq with anti-RBPJ, as well as GSI studies and dnMAML followed by RNAseq. In addition, there is also fairly good genomic data from the Drosophila system. (On the other hand, there is still a need to identify target genes in better-defined systems.) Please correct and add additional references.

      2) Regarding Figure-2: How many genes are in each class? Are all the 624 genes downregulated after 48 hours of DAPT? (Part of these genes could still be direct Notch targets, possibly also harboring RBPJ binding motifs).

      3) Some of the genes known from the mammalian systems do not appear in the presented study in Hydra: What happens to the feedback regulators Dtx and NRARP? Is the Hydra Notch gene itself regulated? What about the oncogene c-myc? (I assume that c-myc also exists in Hydra?)

      4) Evolutionary conservation; (Regarding addition to Figure-9): For readers that are not so familiar with Hydra, it would be extremely helpful to have a summary-table (list) with conserved Notch target genes.

      5) Suggestion: I am not a Hydra-expert, but, if possible, experiments using inducible dominant-negative Mastermind (dnMAML) would strengthen this manuscript.

      Significance

      This study by Moneer et al. is a nice and thoroughly done study, which will further advance our understanding of Notch target genes. This is of interest to readers in signal transduction and developmental biology.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The manuscript of Moneer et al. describes RNAseq data on DAPT-treated Hydra, aiming to identify genes involved in the Notch pathway. The RNAseq data is compared with previously published Single Cell Seq data. They proceed to perform hierarchical clustering, motif enrichment analysis of promoter regions and metagene analysis. The research provides a resource for other researchers who are interested in Notch signalling in Hydra.

      Major comments

      The research is very descriptive in nature. The RNAseq experiment is mostly well set up and analysed, however, the manuscript lacks subsequent experiments to confirm their findings or to determine the possible significance of the data. As a consequence, the authors are not able to draw clear conclusions from the data as most findings are only suggestive.

      The manuscript aims to identify Notch dependent molecular pathways. However, the authors find a lot of indirect targets and a lot of the analyses involve these targets. In comparison, the few potential direct targets, which should be the core of the manuscript, do not receive sufficient attention. The manuscript would be much more significant if the focus would be on the direct targets and would include experiments to determine if the suggestions the current data provides can be confirmed and expanded upon.

      Only two time points were used to establish which two time points were required to be able to differentiate between direct and indirect targets. This experiment requires more time points as well as several known direct and indirect targets as different targets will recover at different rates. Only then will the authors be able to determine whether they used the most appropriate time points.

      A significant number of the figures relies heavily on a previously published paper from the same group. The methods section lacks a description of the statistical analysis performed.

      Minor comments

      The title of the manuscript is too strong for the data provided.

      Although the introduction is well written, the results section lacks clarity and explanation. A concluding sentence at the end of each paragraph would aid the reader in analysing the significance of the findings. In results section 2 the authors mention the identification of 23 metagenes. A figure/table presenting this data would aid the presentation of this data. Fig 6 shows in situ hybridisation data that could potentially be interesting, however, the authors could add some more information to link this data to the Notch pathway.

      In Fig S1 information about the control is lacking. Fig S3 shows alignments and phylogenetic trees but it is not clear what the function is of this figure. Some additional information explaining the relevance of the data would improve the manuscript.

      In the methods section additional information regarding the set up and analysis of the qPCR is required (see MIQE guidelines). This includes further information on how the primers were tested.

      Several of the figures use colour coding but some of these are not defined in the legends. Some of the figures/tables use abbreviations that are not defined. References are split between the regular reference list and a separate list in table S2. There appear to be very few recent references.

      Significance

      The manuscript provides a potential resource for further research. It might be relevant to researchers interested in Notch signalling and/or Hydra as a model organism/evolutionary studies. The data is mostly descriptive in nature. To date Notch signalling in Hydra has not received a lot of attention in the existing literature. The reviewer's area of expertise is Notch signalling in development. The reviewer is not familiar with Hydra as a model system.

  3. Mar 2021
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank all three reviewers for the very positive response to our paper. Only minor revisions were suggested which have all been incorporated.

      Reviewer #1

      We added missing taxonomic names and labels in Figure 6A and improved the punctuation throughout the manuscript.

      Reviewer #2

      As the reviewer suggested we added a schematic representation (Figure 11) depicting the two scenarios, which explain the evolution of DV patterning.

      Reviewer #3

      We did the small textual corrections suggested by the reviewer.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript continues a series of beautiful papers from Roth, Pechmann, Lynch and colleagues analysing D/V patterning in a range of insects. The work started with Drosophila and has extended to other holometabolous and now hemimetabolous insect species.

      This paper is in many ways one of the most remarkable of the series, for it shows that the mechanisms of D/V patterning in the cricket Gryllus are, in several striking respects, very similar to those known from Drosophila - much more so than in some of the other insects studied to date, even though Gryllus is phylogenetically the most distant from Drosophila.

      Specifically, the authors present compelling data to show that the roles of Toll and dpp, as inferred from their knockdown phenotypes, are remarkably similar in Gryllus and Drosophila. This is very different from the consequences of toll and dpp knockdown in the hemipteran Oncopeltus, a species which almost certainly shares a more recent common ancestor with Drosophila.

      The discussion, after summarising the results, addresses the interpretation of this surprising observation. The authors favour the hypothesis that the similarity between Drosophila and Gryllus is the result of convergence in the roles and regulation of Toll and dpp signalling, rather than an ancestral trait that has been lost to a greater or lesser extent in Oncopeltus, and in the two other insects previously studied. The argument for this interpretation is carefully made, on the basis of a thorough knowledge of the comparative embryological literature (including highly relevant recent work).

      Major comments

      The work depends on an analysis of candidate genes, not de novo functional searches. However, it builds on the well-established understanding of the relevant genetic machinery in Drosophila, and on extensive knowledge of the genome and transcriptome of Gryllus, a dataset that has been substantially extended by new work reported in this paper on ovary and embryonic transcriptomes. These data are sufficiently complete to give confidence that all orthologues of most of the known candidate genes have been identified, and to highlight the apparent absence from the Gryllus genome of any orthologue of sog/chordin, a key dpp inhibitor widely involved in D/V patterning.

      The embryology is beautifully described. The early stages of these very yolky eggs are not easy to handle, but the stainings reported here are almost all of high quality, as are the movies of live development using a nuclear GFP marked line.

      The gene knockdowns appear to have been carried out carefully with due regard for the potential biases caused by sterility following parental RNAi. Phenotypes have been documented effectively by the expression of marker genes in fixed embryos, and by live imaging of development in knockdown embryos. Tables in the supplementary data show that sufficient numbers have been obtained. The work is carefully interpreted, and where inferences are less than certain, they are carefully phrased.

      I find the results convincing, and therefore accept the conclusion of fundamental similarity between the roles of Toll and dpp in Drosophila and Gryllus.

      Time will tell whether or not the authors' favoured interpretation of these data as convergent is correct, but I certainly believe that the argument as presented here in the discussion is appropriate for publication in its current form. The abstract is, appropriately, more non-committal than the discussion itself on the interpretation of these results.

      The paper is well written.

      Minor points

      Videos - please state the orientation of the embryos, especially in videos 2 & 4.

      Page 23, bottom: "The early dorsal-to-ventral gradient of pMad (Figure 5AB) indicates that BMP signalling plays an important role ...." Here, "suggests" would be better than "indicates", until functional data is considered.

      Significance

      The gene networks mediating patterning of the D/V body axis are related across the whole range of animals, with in particular the involvement of TGFb/dpp signalling being almost universal in this process. However, there are a great many variations on this theme. Even within the insects, the mechanisms that have been described for establishing localised TGFb and Toll signalling span the range from self organisation to effective maternal prelocalisation. This has made the GRN underlying D/V patterning a key model for studies of the evolution of gene regulatory networks.

      This paper adds an interesting and important twist to the story. It is certainly not the result that any of us would have expected, based on prior published work from Oncopeltus.

      If indeed it does turn out to be a case of convergence, a more detailed mechanistic analysis of that convergence will provide considerable insight into the reproducibility of evolution.

      Other published work: There is no comparable work on D/V patterning in any other polyneopteran insect, to my knowledge.

      Audience: Insect developmental biologists, evolutionary developmental biologists and others interested in the evolution of gene regulatory networks.

      My expertise: Arthropod embryology, axial patterning, evolutionary developmental biology.

      I have not reviewed in detail the presentation of the transcriptomic data and the phylogenetic analysis of gene sequences as presented in the supplementary info.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this paper Pechmann and colleagues investigate the molecular mechanisms of dorso-ventral patterning in Gryllus bimaculatus. As a basis for their study they carry out thorough RNAseq analyses of various embryonic stages. Gryllus is a member of the hemimetabolous insects and therefore of interest for comparison with holometabolous insects such as Drosophila, Tribolium and Nasonia. Previous work has shown that there are significant differences in the use of Toll and Sog in establishing the dorso-ventral gradient of BMP signaling among Drosophila and Nasonia. Pechmann et al find that in Gryllus Toll has a similar role as in Drosophila and is regulated via Pipe, so far only found in Drosophila. Furthermore, they show by RNAi knockdown studies that loss of BMP signaling has little impact on the differentiation of mesoderm in Gryllus, like in Drosophila, hence, BMP signaling has largely a role in dorsal fates. Ventral fates are under direct control of the Toll gradient. Surprisingly, they also find that the key antagonist of BMP signaling and shuttle for BMPs, Sog, has been lost in Ensifera, the lineage leading to Gryllus.

      This is a thorough and detailed study involving a series of functional experiments, which highlights the flexibility and evolvability of the GRN underlying dorso-ventral body axis formation in insects. The major finding that Gryllus is more similar to Drosophila than are Nasonia and Tribolium is interesting and even somewhat unexpected, since Drosophila is often regarded as the derived oddball. The authors discuss two obvious explanations: the situation found in Gryllus and Drosophila reflects the ancestral condition, or, alternatively, it is the result of convergent evolution. They tend to favor the latter hypothesis. This study is an important advancement to our understanding, as it shows the constraints and the evolvability of a key patterning system to establish a body axis.

      Even though the authors show nicely that Toll signaling is required to establish the BMP signaling gradient, the loss of Sog in Gryllus leaves unanswered the question of how the long-range BMP gradient and its shape are established. In Drosophila and vertebrates, Sog/Chordin acts both as an antagonist close to its source and as a shuttling factor, promoting BMP signaling at a distance, which is crucially important for the long range and the shape of the BMP signaling gradient. It would be desirable to test the function of other potential BMP antagonists (follistatin, gremlin, noggin) or competing BMPs (BMP3, ADMP) in this context.

      As a minor suggestion, I would recommend summarizing the findings in a synthetic picture depicting the evolutionary scenarios of the two hypotheses.

      Significance:

      This study is an important advancement to our understanding, as it shows the constraints and the evolvability of a key patterning system to establish a body axis.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      • The authors have carried out an extensive survey of dorso-ventral axis determination in the cricket Gryllus bimaculatus. They did this through analysing and knocking down key components of the two main pathways involved in D/V patterning, the toll pathway and BMP signalling. This analysis was placed in a comparative context, looking at published data on four other insect species, with the aim of contributing to our understanding of the evolution of D/V patterning.
      • The authors find significant similarities between D/V patterning in Gryllus and in Drosophila. These similarities are both in the relative contributions of toll and BMP to D/V polarization and in the early ovarian activation of the toll pathway. Despite these similarities, a closer analysis of the molecular interactions uncovers some significant differences, most notably the absence of several key modulators of BMP activity. These results lead the authors to conclude that the similarities in D/V patterning between Gryllus and Drosophila are due to convergence and not due to the conservation in Drosophila of an ancestral patterning mechanism that has been lost in almost all other lineages studied.

      Major comments:

      • All in all this is an excellent paper. There is a huge amount of data in here, and everything is done very meticulously and carefully. There is a good balance between mostly descriptive work (gene expression patterns, cell movements in WT embryos) and experimental work. I could find no obvious flaws with any of the results or methods, and I think the authors have made a convincing case to support their conclusions, without being too dogmatic.
      • I don't see a need for any additional experiments beyond what the authors have done. They have covered all relevant aspects of D/V patterning, and make a convincing case with the data they have.

      Minor comments:

      The few comments I have are very minor and technical:

      • Missing taxonomic names (families) in Fig. 1.
      • Missing label in Fig. 6 Panel A.
      • Punctuation could be improved. There are several instances of missing commas, and other places with unnecessary commas.

      Significance:

      • The manuscript represents an admirable amount of work. One can say that in a single paper, the authors have provided nearly as much information about Gryllus D/V patterning as is available for other "second-order" insect model species such as Oncopeltus or Nasonia. As such, it provides an additional major phylogenetic anchor point for understanding the evolution of early patterning.
      • In terms of significance to advancing our knowledge, the data in the manuscript is, as stated above, an anchor point. It does not on its own provide any major novel insight, but fits into an ever-expanding body of comparative knowledge, whose importance is greater than the sum of its parts. Perhaps the most interesting conclusion is indeed the one the authors have chosen as the selling point of their paper: the fact that there is functional convergence in certain aspects of D/V patterning between two widely diverged insect species with very different oogenesis and early development. This is, again, not a major advance on its own, but an important additional piece of the comparative picture of early insect development.
      • This paper will be of significant interest to the research community of comparative insect development (the community to which this reviewer belongs). It will also be of interest to those interested in examples of convergence at the functional and molecular level, to those interested in the evolution of gene families and to those interested specifically in the signalling pathways discussed (even in a non-comparative context).
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Author responses are written in bold and are italicized. We have underlined the important points in the reviewer's comments. All responses have been read and authorized by all authors of this manuscript. Authors would like to thank the reviewers and the editor for their valuable time. We believe that the comments and suggestions from both reviewers will significantly improve SMorph and the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      First of all, I want to apologize to the authors and editor for my delay. Secondly, for clarity, I want to disclose that I am the author of Fiji's 'Sholl Analysis' plugin, which the authors cite extensively (Ferreira et al, Nat Methods, 2014).

      In this study, Sethi et al. introduce a software tool - SMorph - for bulk morphometric analysis of neurons and glia (astrocytes and microglia), based on the Sholl technique. The authors compare it to the state of the art in a series of validation experiments (stab wound injury), and conclude that it is 1000 times faster than existing tools. Empowered by the tool, the authors show that chronic administration of a tricyclic antidepressant (DMI) leads to structural changes of astrocytes in the mouse hippocampus. The paper is well written, the description of the tool is clear, and the authors make all of the source code available, as well as most of the imagery analyzed in the manuscript. The latter, on its own, makes me really appreciative of the authors' work.

      We thank reviewer #1 for their careful reading of the manuscript and their comments.

      Major comments:

      A major strength of SMorph is that it leverages the Python ecosystem, which allows the authors to take advantage of powerful Python packages such as sklearn, without the need for external packages or tools. However, I have strong criticisms of the claims that are made in terms of speed and broad applicability of the software, including PCA.

      Speed:

      The 1000x speed gain assumes, for the most part, <u>that the processing in Fiji cannot be automated</u>. This is false. I read the source code of SMorph, and with the exception of the PCA analysis, all aspects of SMorph can be automated in Fiji, using any of Fiji's scripting languages to make direct calls to the Fiji and Sholl Analysis plugin APIs (see https://javadoc.scijava.org/). Now, perhaps the authors do not have experience with ImageJ scripting, or perhaps we Fiji developers failed to provide clear tutorials and examples on how to do so. Or perhaps there is something inherently cumbersome with Fiji scripting that makes this hard (e.g., there is a current limitation with the ImageJ2 version of 'Sholl Analysis' that does not make it macro recordable). If such limitations do exist, it is perfectly fine to mention them, but do contact us at https://forum.image.sc if something is unclear. We do strive to make our work as re-usable as possible. Unfortunately, our own research does not always allow us the time required to do so. Case in point, our scripting examples (e.g., https://github.com/tferr/ASA/blob/master/scripting-examples/3D_Analysis_ImageStack.py) are not well advertised. <u>That being said, I am still surprised that in their side-by-side comparisons the authors were not able to automate more of the processing steps</u> (e.g., the ImageJ1 version of 'Sholl Analysis' remains fully functional and is macro recordable). If I misunderstood what was done, please provide the ImageJ macros you used. Also, I wanted to mention that i) semi-manual tracing with Simple Neurite Tracer (now "SNT") can also be scripted (see https://doi.org/10.1101/2020.07.13.179325); and that ii) Fiji commands and plugins can also be called in native Python using pyimagej (https://pypi.org/project/pyimagej/); see, e.g., https://github.com/morphonets/SNT/tree/master/notebooks#snt-notebooks.

      Arguably, the fact that SMorph handles blob detection and skeletonization-based metrics directly is more advantageous from a user point of view. In Fiji, blob detection, skeletonization and Strahler analysis (https://imagej.net/Strahler_Analysis) of the skeleton are handled by different plugins. However, those are also fully scriptable, and interoperate well. The point that topographic skeletonization in Fiji can originate loops is valid; however, the authors should know that such cycles can be detected and pruned programmatically using, e.g., pixel intensities (see https://imagej.net/AnalyzeSkeleton.html#Loop_detection_and_pruning and the original publication, https://pubmed.ncbi.nlm.nih.gov/20232465/).

      We completely agree with the reviewer’s assertion that most parts of the functionality of SMorph can be automated within ImageJ as well, and that in such a comparison, the speed gains with SMorph will not be >1000X.

      However, automating the analysis in ImageJ is beyond the scope of the present manuscript. In fact, the ImageJ analysis comparison was not a part of our original manuscript at all. Upon presubmission inquiry to one of the affiliate journals of Review Commons, we were specifically asked to include a side-by-side comparison with <u>“already available”</u> methods. So, we decided to use ImageJ as it is, and automation, if any, was limited to simple macros to run a series of commands sequentially on batches of images. Although it is true that this analysis could be done much more efficiently with additional scripting, it would not have met the definition of “already available” tools. The ImageJ analysis was performed in the way an average biologist with no programming experience would perform it, since that group will find SMorph most useful. In no way do we intend to imply that ImageJ analysis can’t be made more efficient and automated. Perhaps this was not clear from the way the text was framed in the initial version of the manuscript. We will add additional text to make this point clearer.

      On a side-note, in response to reviewer #2’s comments, we will perform the speed comparison on a per-image basis, so the speed gain (1080X) may change a little in the new comparison.

      Broad applicability:

      In our work, we made a significant effort to ensure that automated Sholl could be performed on any cell type: e.g., by supporting 2D and 3D images, by allowing repeated measures at each sampled distance, and by improving curve fitting. For linear profiles, we implemented the ability to perform <u>polynomial fits of arbitrary degree, and implemented heuristics for 'best degree' determination</u>. For normalized profiles, we implemented several normalizers, and alternatives for determining regression coefficients. We did not tackle segmentation of images directly (we did provide some accompanying scripts to aid users, see e.g. https://imagej.net/BAR) because in our case that is handled directly by ImageJ and Fiji's large collection of plugins. However, <u>in SMorph, several of these parameters are hard-wired in the code</u>. They may be suitable for the analyzed images, but they can hardly be generalized to other datasets. In detail: in terms of segmentation, SMorph is restricted to 2D images, scales data to a fixed 98th percentile, and uses a fixed auto-threshold method (Otsu). These settings are tethered to the authors' imagery. They will give poor results for someone using a different imaging setup or staining method. In terms of curve fitting, the polynomial regression seems to be fixed at a 3rd-order polynomial, which will not be suitable for different cell types (not even for all cells of 'radial morphology').

      We have indeed hard-coded the parameters that the reviewer mentions, and we agree that we can perhaps give all options to the end-users to choose from. The decision was made to hard-code the parameters so that SMorph becomes very easy and minimalistic to use for the end-users. But the reviewer is right to point out that this may compromise the broad applicability and accuracy. We will update the code in the revised version of the manuscript to give the users control over choosing these parameters.
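
      As an illustration of what a user-selectable polynomial degree could look like, the sketch below fits a linear Sholl profile at several degrees and keeps the one with the highest adjusted R². The function names, the adjusted-R² criterion and the `max_degree` default are assumptions made for this example; they are not taken from SMorph or from the Sholl Analysis plugin.

```python
import numpy as np


def adjusted_r2(y, y_fit, degree):
    """Adjusted R^2, which penalises additional polynomial terms."""
    n = len(y)
    ss_res = float(np.sum((y - y_fit) ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    if ss_tot == 0.0:
        return 0.0  # a flat profile carries no variance to explain
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - degree - 1)


def best_polynomial_degree(radii, counts, max_degree=5):
    """Return (degree, coefficients) of the fit with the highest adjusted R^2.

    `radii` are sampling distances and `counts` the intersections at each
    distance. Hypothetical helper, not SMorph's API.
    """
    radii = np.asarray(radii, dtype=float)
    counts = np.asarray(counts, dtype=float)
    best = None
    for degree in range(1, max_degree + 1):
        if len(radii) <= degree + 1:
            break  # too few samples for this degree
        coeffs = np.polyfit(radii, counts, degree)
        score = adjusted_r2(counts, np.polyval(coeffs, radii), degree)
        if best is None or score > best[0]:
            best = (score, degree, coeffs)
    if best is None:
        raise ValueError("need at least three sampled radii")
    return best[1], best[2]
```

      A caller could still pass a fixed degree (for instance 3) when reproducing earlier results, while other datasets fall back to the heuristic.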

      PCA:

      <u>The idea of making PCA analysis of Sholl-based morphometry accessible to a broader user base has merit and is welcomed</u>. However, it has to be done carefully in a <u>self-critical manner as opposed to a black-box solution</u>. E.g., the text mentions that 2 principal components are used, but the tutorial notebook uses 3. <u>Why not provide intuitive scree plots that empower users to critique that choice?</u> Also, it would be useful for users to understand which metrics correlate with each other, and their variable weights.

      Reviewer #1’s suggestions would indeed make the PCA analysis more useful to the users. In the revised version of the code, we will provide additional data/plots to the user for making an informed choice of the significant principal components e.g. the elbow method, Ogive or Pareto plots, variable weights of different features in the principal components and correlation/covariance matrices.
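
      The quantities behind a scree plot and the variable weights can be computed in a few lines. The sketch below is a NumPy-only illustration using a singular value decomposition; the function names and the 90% cumulative-variance threshold are assumptions for the example, not the revised SMorph code.

```python
import numpy as np


def pca_summary(features):
    """PCA via SVD on mean-centred data.

    Returns the explained-variance ratio per component (the values a scree
    plot displays) and the loadings, one row per component and one column
    per morphometric feature.
    """
    x = np.asarray(features, dtype=float)
    x = x - x.mean(axis=0)
    _, singular_values, loadings = np.linalg.svd(x, full_matrices=False)
    variance = singular_values ** 2 / (x.shape[0] - 1)
    return variance / variance.sum(), loadings


def components_for(ratio, threshold=0.9):
    """Smallest number of components whose cumulative ratio reaches threshold."""
    cumulative = np.cumsum(ratio)
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, len(ratio))
```

      Plotting the returned ratios against the component index gives the scree plot; the rows of the loadings matrix show how strongly each feature contributes to a component.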

      When we showcased the utility of PCA to distinguish closely related morphology groups (as in Type-1 and Type-2 PV neurons), we had been unable to base the distinction on individual metrics, at least not in a robust manner (see Fig. S4 in Ferreira et al, 2014). <u>A minor conundrum of the paper is that it does not directly highlight the advantages of "analyzes in a multidimensional space"</u>. The differences between groups in the stab wound and DMI assays are such that PCA is hardly needed: i.e., the differences depicted in Fig. 2F,G are already significant, and already convey changes in "size and branch complexity" (as per PC1). The same argument applies to Fig. 5. The paper would profit from having this discussed.

      The PCA data are indeed not required for any of the inferences we make in the paper and are, in that sense, superfluous. However, as mentioned in the discussion section of this manuscript, the low-dimensional PCA data can be used in future for other applications, e.g., to cluster the astrocytes into morphometrically defined subpopulations. SMorph can be further developed to perform real-time classification of these cells into morphometric clusters, which will allow researchers to investigate cluster-specific gene expression, electrophysiology etc. Preliminary results from our lab do suggest that such clusters are differentially altered by stress and antidepressant treatments. However, these results are part of a long-term future study and are too premature to publish at this stage, since it will require a lot of experimentation to show that these astrocyte subpopulations are indeed physiologically and functionally different. Nevertheless, we think that the utility of SMorph for such analyses may help others to come up with additional innovative ways to use the PCA data. Hence, we do believe that the community will benefit from the current release of SMorph having PCA. PCA data was shown in the figures just to demonstrate the functionality of SMorph. We will add text to make these points clearer.

      Other:

      • All metrics and parameters should be expressed in physical units (e.g.," radii increasing by 3 pixels", axes in Figure 2, 3, 5, S2) so that readers can directly interpret them.

      In the revised manuscript, we will convert all units into actual physical distances.

      • The paper would profit from the insights provided by Bird & Cuntz (https://pubmed.ncbi.nlm.nih.gov/31167149/)

      We thank the reviewer for suggesting this paper. We will include this in the discussion of the manuscript.

      Minor comments:

      • Usage of RGB images (8-bit per channel) seems hardly justifiable. Aren't you losing dynamic range of GFAP signal?

      We agree that we could have captured the images at a higher dynamic range. For the changes we observe between treatment groups using the GFAP immunoreactivity signal presented in the manuscript, we do not see an advantage in using a higher dynamic range. However, as the reviewer rightly points out, imaging at a higher dynamic range may help under certain conditions, and hence we will include this recommendation in the materials and methods section.

      • Please explain how MaxAbsScaler "prevents sub-optimal results"

      Since morphometric features extracted from cell images either have different units or are scalar, we had to perform normalization before PCA. We will add further explanation in the methods section of the manuscript.
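
      To make the role of scaling concrete, the sketch below divides each feature by its maximum absolute value, which is the rule scikit-learn documents for `MaxAbsScaler`, and compares how much of the total variance one feature carries before and after. The helper names and the toy numbers are assumptions for illustration only.

```python
import numpy as np


def max_abs_scale(features):
    """Divide each column by its maximum absolute value (range becomes [-1, 1])."""
    x = np.asarray(features, dtype=float)
    scale = np.max(np.abs(x), axis=0)
    scale[scale == 0] = 1.0  # leave all-zero columns unchanged
    return x / scale


def variance_share(features, column):
    """Fraction of the total variance carried by one column."""
    variances = np.var(np.asarray(features, dtype=float), axis=0, ddof=1)
    return float(variances[column] / variances.sum())


# Toy data: column 0 is an area in square pixels, column 1 a branch count.
toy = np.array([[1000.0, 4.0], [2000.0, 1.0], [3000.0, 3.0], [4000.0, 2.0]])
share_raw = variance_share(toy, 0)                     # area dominates
share_scaled = variance_share(max_abs_scale(toy), 0)   # features comparable
```

      Without scaling, the first principal component would essentially track the feature with the largest numerical range, regardless of its biological relevance.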

      • The fact that automated batch processing can stall on a single bad 'contrast ratio' image seems rather cumbersome to deal with

      This problem has been resolved in the current version of SMorph, which will be uploaded with the revised version of the manuscript.

      • Please add a license to https://github.com/parulsethi/SMorph/. Without it, other projects may shy away from using SMorph

      We will add a GPLv3 license

      • "mounted on stereotax" should be "mounted on a stereotaxis device"?

      We will make this change

      • Ensure Schoenen is capitalized

      We will make this change

      Reviewer #1 (Significance):

      <u>I find the Desipramine results interesting</u>. However, given the existing claims that DMI can modulate LTP, I regret that the authors did not look at <u>structural modifications in hippocampal neurons</u> (e.g., by performing the experiments in Thy1-M-eGFP animals). I understand, that doing so at this point would be a large undertaking.

      Another manuscript from our lab [1], as well as work from other labs, has shown that stress causes significant degenerative changes in hippocampal astrocytes [2,3]. In light of these observations, we do believe that our observation of chronic antidepressant treatment inducing structural plasticity in astrocytes is significant. Structural alterations in neurons after DMI treatment are of interest. But in our experience, we have not seen gross morphological (dendritic arborization) changes in hippocampal neurons as a result of antidepressant drug treatments. Such changes are restricted to spine morphology and axonal varicosities, which are beyond the capabilities of SMorph.

      Reviewer #2 (Evidence, reproducibility and clarity):

      This paper addresses the challenge of automatic Sholl analysis of large datasets of multiple cell types such as neurons, astrocytes and microglia. <u>The developed approach should improve the speed of morphology analysis compared to the state of the art without compromising on the accuracy</u>. The authors present an interesting application of their tool to the morphological analysis of astrocytes following chronic antidepressant treatment. The paper is well written, and the tool presented could be <u>beneficial for different applications and contexts</u>. However, some major aspects should be addressed by the authors concerning the description of the algorithms used and the quantification of the results.

      We thank reviewer #2 for their careful reading of the paper and their comments.

      Major comments/Questions:

      1. In the Results and/or Methods sections, the authors should better describe how their approach differs from state-of-the-art approaches in terms of the algorithms used, and how these differences impact the speed and accuracy of the analysis.

      We will add these descriptions in the methods section in response to this comment as well as some comments from reviewer #1.

      2. Imaging was performed on a Zeiss LSM 880 airyscan confocal microscope. Is this method robust to other types of imaging techniques, other microscopes, variable levels of signal-to-noise? This should be tested and quantified.

      We will demonstrate the results obtained from images taken using different microscopes and imaging techniques, and quantify the outcome.

      3. Manual cropping of the cells with ImageJ was used. However, in the methods section, the authors mention that other machine learning tools could be used for this task. Why were these tools not implemented in this paper in order to propose a fully automated analysis approach in combination with SMorph?

      We have tried both the machine learning tools cited in this paper (one for DAB images and other for confocal images). However, in our experience, we do not get robust performance from these tools with our datasets, and these tools will perhaps need more optimization for broad applicability. We are developing an auto-cropping tool in-house, but that is beyond the scope of the current study. Another point is that these tools are tailor-made for astrocytes, and their integration into SMorph will restrict its applicability to just one cell type.

      4. In the methods section you state that cropped cells need to have a good contrast ratio for automated batch processing. Could you define what a good contrast ratio is and characterize the performance of your approach for different contrast ratios?

      In the revised manuscript, we will compare the images taken from multiple microscopes and quantify the outcome. We will change the text accordingly. As such, the comment on rejected cells referred to really poor quality images. In the revised manuscript, we will make specific recommendations on imaging parameters so that this should not be an issue at all.

      5. It is mentioned that the analysis routine can be interrupted by a cell with lower contrast ratio. This is a major drawback of the approach (but I think that it could be easily improved), as such interruptions may not be practicable for many applications that need to rely on automated processing.

      We have already rectified this problem and the updated version of SMorph will be uploaded with the revised manuscript.
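
      One common way to keep a batch running when a single image fails is to catch the error per cell and report it at the end. The sketch below shows that pattern in plain Python; `run_batch` and the `analyse` callable are hypothetical names for this example and do not describe how the updated SMorph handles the problem.

```python
def run_batch(paths, analyse):
    """Apply `analyse` to every path and collect failures instead of stopping.

    Returns two dictionaries keyed by path: successful results, and a short
    description of the error for each image that could not be processed.
    """
    results, failures = {}, {}
    for path in paths:
        try:
            results[path] = analyse(path)
        except Exception as error:  # broad on purpose: one bad cell must not stop the batch
            failures[path] = repr(error)
    return results, failures
```

      Reporting the skipped images alongside the results also lets a user check whether exclusions are unevenly distributed across treatment groups.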

      6. Also, you should specify how the contrast ratio should be enhanced without modifying raw data in order to be processed with your approach. You suggest removing cells with lower contrast ratio from the analysis, but can this affect the findings, especially if some treatments affect the detected fluorescence signal? Can you propose ways to improve the robustness of your approach to variable signal ratios?

      It is indeed possible that removing cells from the analysis may, in certain cases, affect the results. To rectify this, we are testing the method on images obtained from different microscopes and under different imaging conditions. From these analyses, we will derive minimum recommendations for imaging conditions so that images do not have to be edited or removed altogether from the analysis for the software to work. In the materials and methods section, we will add these recommendations to the users on the optimal range of imaging parameters. This way, rejection or modification of images should not be an issue.

      7. In the Results section, you describe the time necessary to perform different analyses. However, giving a total time in hours is not very informative as this will likely vary a lot depending on the size of the dataset, complexity of the images, etc. You should compare the average time per image for both methods and types of analysis.

      We compared the total time required for the entire dataset, since SMorph is meant for batch-processing all the images at once. However, we can change the comparisons to time taken per image. We can divide the total time taken by SMorph by the number of images analysed. However, in our opinion, the time taken to initiate SMorph will make these comparisons inaccurate.
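
      One way to report a per-image figure without the start-up cost distorting it is to time initialization and per-image work separately. The sketch below does this with the standard-library timer; `time_per_image`, `analyse` and `setup` are hypothetical names for this example, not part of SMorph.

```python
import time


def time_per_image(paths, analyse, setup=None):
    """Return (setup_seconds, mean_seconds_per_image).

    `setup` stands for any one-off initialization; it is timed separately
    so that it does not inflate the per-image average.
    """
    start = time.perf_counter()
    if setup is not None:
        setup()
    setup_seconds = time.perf_counter() - start

    per_image = []
    for path in paths:
        t0 = time.perf_counter()
        analyse(path)
        per_image.append(time.perf_counter() - t0)

    mean_seconds = sum(per_image) / len(per_image) if per_image else 0.0
    return setup_seconds, mean_seconds
```

      Reporting both numbers makes the comparison independent of dataset size, since the one-off cost is amortized differently for small and large batches.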

      8. You state that for the number of branch points, the lower value of the measured slope when comparing SMorph and ImageJ was related to a constant overestimation of this parameter with ImageJ. How was this quantified? I think you should place more emphasis on the comparison of both approaches with the manually annotated dataset.

      In the revised version of this manuscript, we will include some examples of skeletonized images that overestimate the number of forks. We have observed this to be a recurring problem with the skeletonization tools we have tried in ImageJ. This can be rectified in ImageJ itself as pointed out by reviewer #1. However, that’s beyond the scope of the present study and will not fit the definition of comparison with “already available” methods.

      9. How can you explain the differences in the 2D-projected Area, total skeleton length and convex hull between SMorph and ImageJ, which all show a slope around 0.83? Can you quantify the performance of both methods by comparing them with your manually annotated dataset?

      In the revised version, we will include the correlation data between completely manual and SMorph comparisons. We will discuss these comparisons further in the manuscript and make specific conclusions about the accuracy.

      10. In the introduction and discussion, you mention that you present a method that works on neurons, astrocytes and microglia. However, I don't see in the paper the comparison between the accuracy for all these cell types as you seem to have analyzed only the morphology of astrocytes.

      In the revised manuscript, we will include the Sholl analysis comparison (ImageJ vs SMorph) from images of neurons and microglia.

      11. You mention that your method is quite sensitive to variation in contrast ratio. You should quantify the contrast ratio throughout the experiments and ensure that this is not biasing the SMorph analysis for some of the treatments.

      We thank both reviewers for highlighting this issue in the initial version of SMorph. As mentioned in our response to point #6, we will perform additional analyses to make specific recommendations to the end users regarding imaging parameters so that SMorph can work on images as they are. As such, our comments on contrast ratio applied only to very poor quality images. If images are acquired conforming to the imaging parameters we will recommend in the revised manuscript, images can be analysed without any issues.

      Minor Points :

      1. Specify the exact inclusion and exclusion criteria for soma detection and rephrase: "The high-intensity blobs were detected as a position of soma..." & "Boundary blobs coming from adjacent cells...".

      We will add a complete explanation of blob detection and the exclusion criterion in the methods section.

      2. Throughout the text, make sure to always refer to an analysis time per image or per cell and not only include absolute duration values without reference to the task at hand (e.g., in the discussion: "SMorph took 40 seconds to complete the analysis..."; please state exactly which analysis you are referring to and, if applicable, whether it varies from cell to cell).

      We will change all comparisons to time taken per cell. Text will be added to mention which datasets were used when any claims of speed are made.

      3. When you state in the discussion that "Although some methods do allow Sholl analysis without manual neurite tracing, they still work on one cell at a time", please specify whether the only aspect that is missing from this type of analysis is batch processing (looping through the data) or whether there is a major obstacle to automating this technique. This is important, as SMorph does proceed with the analysis one cell at a time but can work in a loop/batch.

      We will elaborate further on our assertion regarding the challenges of using ImageJ plugins for Sholl analysis in large batches of cells.

      Reviewer #2 (Significance):

      <u>This tool could be very useful to researchers in the field of cellular neuroscience working with high-throughput analysis of microscopy data</u>. The authors show some interesting improvements over existing methods. An improved quantitative characterization of the robustness of their approach would be of great importance to ensure the significance of this tool to a large community of researchers using different types of microscopes or studying different cell types.

      My expertise is in the field of optical microscopy and high-throughput (automated) image analysis for neuroscience. My expertise to evaluate the biological findings in this study is very limited.

      We thank reviewer #2 for their careful reading of the manuscript and their insightful comments. Growing evidence (clinical and preclinical) shows a significant reduction in astrocyte density in key limbic brain regions as a result of depression. We believe that the structural plasticity induced by chronic antidepressant treatment, as demonstrated in this manuscript, is an interesting novel plasticity mechanism that can negate deleterious effects of stress on astrocytes.

      The improvements suggested by both reviewers will help us to greatly improve SMorph in the revised version of this manuscript.

      References:

      1. Virmani, G., D’almeida, P., Nandi, A. & Marathe, S. Subfield-specific Effects of Chronic Mild Unpredictable Stress on Hippocampal Astrocytes. doi:10.1101/2020.02.07.938472.

      2. Czéh, B., Simon, M., Schmelting, B., Hiemke, C. & Fuchs, E. Astroglial plasticity in the hippocampus is affected by chronic psychosocial stress and concomitant fluoxetine treatment. Neuropsychopharmacology 31, 1616–1626 (2006).

      3. Musholt, K. et al. Neonatal separation stress reduces glial fibrillary acidic protein- and S100beta-immunoreactive astrocytes in the rat medial precentral cortex. Dev. Neurobiol. 69, 203–211 (2009).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This paper addresses the challenge of automatic Sholl analysis of large datasets of multiple cell types such as neurons, astrocytes and microglia. The developed approach should improve the speed of morphology analysis compared to the state of the art without compromising on the accuracy. The authors present an interesting application of their tool to the morphological analysis of astrocytes following chronic antidepressant treatment. The paper is well written, and the tool presented could be beneficial for different applications and contexts. However, some major aspects should be addressed by the authors concerning the description of the algorithms used and the quantification of the results.

      Major comments/Questions:

      1. In the Results and/or Methods sections, the authors should better describe how their approach differs from state-of-the-art approaches in terms of the algorithms used, and how these differences impact the speed and accuracy of the analysis.
      2. Imaging was performed on a Zeiss LSM 880 airyscan confocal microscope. Is this method robust to other types of imaging techniques, other microscopes, variable levels of signal-to-noise? This should be tested and quantified.
      3. Manual cropping of the cells with ImageJ was used. However, in the methods section, the authors mention that other machine learning tools could be used for this task. Why were these tools not implemented in this paper in order to propose a fully automated analysis approach in combination with SMorph?
      4. In the methods section you state that cropped cells need to have a good contrast ratio for automated batch processing. Could you define what a good contrast ratio is and characterize the performance of your approach for different contrast ratios?
      5. It is mentioned that the analysis routine can be interrupted by a cell with lower contrast ratio. This is a major drawback of the approach (but I think that it could be easily improved), as such interruptions may not be practicable for many applications that need to rely on automated processing.
      6. Also, you should specify how the contrast ratio should be enhanced without modifying raw data in order to be processed with your approach. You suggest removing cells with lower contrast ratio from the analysis, but can this affect the findings, especially if some treatments affect the detected fluorescence signal? Can you propose ways to improve the robustness of your approach to variable signal ratios?
      7. In the Results section, you describe the time necessary to perform different analyses. However, giving a total time in hours is not very informative as this will likely vary a lot depending on the size of the dataset, complexity of the images, etc. You should compare the average time per image for both methods and types of analysis.
      8. You state that for the number of branch points, the lower value of the measured slope when comparing SMorph and ImageJ was related to a constant overestimation of this parameter with ImageJ. How was this quantified? I think you should place more emphasis on the comparison of both approaches with the manually annotated dataset.
      9. How can you explain the differences in the 2D-projected Area, total skeleton length and convex hull between SMorph and ImageJ, which all show a slope around 0.83? Can you quantify the performance of both methods by comparing them with your manually annotated dataset?
      10. In the introduction and discussion, you mention that you present a method that works on neurons, astrocytes and microglia. However, I don't see in the paper the comparison between the accuracy for all these cell types as you seem to have analyzed only the morphology of astrocytes.
      11. You mention that your method is quite sensitive to variation in contrast ratio. You should quantify the contrast ratio throughout the experiments and ensure that this is not biasing the SMorph analysis for some of the treatments.

      Minor Points :

      1. Specify the exact inclusion and exclusion criteria for soma detection and rephrase: "The high-intensity blobs were detected as a position of soma..." & "Boundary blobs coming from adjacent cells...".
      2. Throughout the text, make sure to always refer to an analysis time per image or per cell and not only include absolute duration values without reference to the task at hand (e.g., in the discussion: "SMorph took 40 seconds to complete the analysis..."; please state exactly which analysis you are referring to and, if applicable, whether it varies from cell to cell).
      3. When you state in the discussion that "Although some methods do allow Sholl analysis without manual neurite tracing, they still work on one cell at a time", please specify whether the only aspect that is missing from this type of analysis is batch processing (looping through the data) or whether there is a major obstacle to automating this technique. This is important, as SMorph does proceed with the analysis one cell at a time but can work in a loop/batch.

      Significance

      This tool could be very useful to researchers in the field of cellular neuroscience working with high-throughput analysis of microscopy data. The authors show some interesting improvements over existing methods. An improved quantitative characterization of the robustness of their approach would be of great importance to ensure the significance of this tool to a large community of researchers using different types of microscopes or studying different cell types.

      My expertise is in the field of optical microscopy and high-throughput (automated) image analysis for neuroscience. My expertise to evaluate the biological findings in this study is very limited.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      First of all, I want to apologize to the authors and editor for my delay. Secondly, for clarity, I want to disclose that I am the author of Fiji's 'Sholl Analysis' plugin, which the authors cite extensively (Ferreira et al, Nat Methods, 2014).

      In this study, Sethi et al introduce a software tool - SMorph - for bulk morphometric analysis of neurons and glia (astrocytes and microglia), based on the Sholl technique. The authors compare it to the state-of-the-art in a series of validation experiments (stab wound injury), to conclude that it is 1000 times faster than existing tools. Empowered by the tool, the authors show that chronic administration of a tricyclic antidepressant (DMI) leads to structural changes of astrocytes in the mouse hippocampus. The paper is well written, the description of the tool is clear, and the authors make all of the source code available, as well as most of the imagery analyzed in the manuscript. The latter, on its own, makes me really appreciative of the authors' work.

      Major comments:

      A major strength of SMorph is that it leverages the Python ecosystem, which allows the authors to take advantage of powerful Python packages such as sklearn, without the need for external packages or tools. However, I have strong criticisms of the claims that are made in terms of speed and broad applicability of the software, including PCA.

      Speed:

      The 1000x speed gain assumes, for the most part, that the processing in Fiji cannot be automated. This is false. I read the source code of SMorph, and with the exception of the PCA analysis, all aspects of SMorph can be automated in Fiji, using any of Fiji's scripting languages to make direct calls to the Fiji and Sholl Analysis plugin APIs (see https://javadoc.scijava.org/). Now, perhaps the authors do not have experience with ImageJ scripting, or perhaps we Fiji developers failed to provide clear tutorials and examples on how to do so. Or perhaps there is something inherently cumbersome with Fiji scripting that makes this hard (e.g., there is a current limitation with the ImageJ2 version of 'Sholl Analysis' that does not make it macro recordable). If such limitations do exist, it is perfectly fine to mention them, but do contact us at https://forum.image.sc if something is unclear. We do strive to make our work as re-usable as possible. Unfortunately our own research does not always allow us the time required to do so. Case in point, our scripting examples (e.g., https://github.com/tferr/ASA/blob/master/scripting-examples/3D_Analysis_ImageStack.py; https://github.com/tferr/ASA/blob/master/scripting-examples/3D_Analysis_ImageStack.py) are not well advertised. That being said, I am still surprised that in their side-by-side comparisons the authors were not able to automate more of the processing steps (e.g., the ImageJ1 version of 'Sholl Analysis' remains fully functional and is macro recordable). If I misunderstood what was done, please provide the ImageJ macros you used. Also, I wanted to mention that i) semi-manual tracing with Simple Neurite Tracer (now "SNT") can also be scripted (see https://doi.org/10.1101/2020.07.13.179325); and that ii) Fiji commands and plugins can also be called in native Python using pyimagej (https://pypi.org/project/pyimagej/; see e.g., https://github.com/morphonets/SNT/tree/master/notebooks#snt-notebooks).
      Arguably, the fact that SMorph handles blob detection and skeletonization-based metrics directly is an advantage from a user's point of view. In Fiji, blob detection, skeletonization and Strahler analysis (https://imagej.net/Strahler_Analysis) of the skeleton are handled by different plugins. However, those are also fully scriptable, and interoperate well. The point that topographic skeletonization in Fiji can produce loops is valid; however, the authors should know that such cycles can be detected and pruned programmatically using, e.g., pixel intensities (see https://imagej.net/AnalyzeSkeleton.html#Loop_detection_and_pruning and the original publication, https://pubmed.ncbi.nlm.nih.gov/20232465/).

      Broad applicability:

      In our work, we made a significant effort to ensure that automated Sholl could be performed on any cell type: e.g., by supporting 2D and 3D images, by allowing repeated measures at each sampled distance, and by improving curve fitting. For linear profiles, we implemented the ability to perform polynomial fits of arbitrary degree, and implemented heuristics for 'best degree' determination. For normalized profiles, we implemented several normalizers, and alternatives for determining regression coefficients. We did not tackle segmentation of images directly (we did provide some accompanying scripts to aid users, see e.g. https://imagej.net/BAR) because in our case that is handled directly by ImageJ and Fiji's large collection of plugins. However, in SMorph, several of these parameters are hard-wired in the code. They may be suitable for the analyzed images, but they can hardly be generalized to other datasets. In detail: in terms of segmentation, SMorph is restricted to 2D images, scales data to a fixed 98th percentile, and uses a fixed auto-threshold method (Otsu). These settings are tethered to the authors' imagery. They will give poor results for someone using a different imaging setup or staining method. In terms of curve fitting, the polynomial regression seems to be fixed at a 3rd-order polynomial, which will not be suitable for different cell types (not even for all cells of 'radial morphology').

      PCA:

      The idea of making PCA analysis of Sholl-based morphometry accessible to a broader user base has merit and is welcomed. However, it has to be done carefully in a self-critical manner as opposed to a black-box solution. E.g., the text mentions that 2 principal components are used, but the tutorial notebook uses 3. Why not provide intuitive scree plots that empower users to critique that choice? Also, it would be useful for users to understand which metrics correlate with each other, and their variable weights.

      When we showcased the utility of PCA to distinguish closely related morphology groups (as in Type-1 and Type-2 PV neurons), we had been unable to base the distinction on individual metrics, at least not in a robust manner (see Fig. S4 in Ferreira et al, 2014). A minor conundrum of the paper is that it does not directly highlight the advantages of "analyzes in a multidimensional space". The differences between groups in the stab wound and DMI assays are such that PCA is hardly needed: i.e., the differences depicted in Fig. 2F,G are already significant, and already convey changes in "size and branch complexity" (as per PC1). The same argument applies to Fig. 5. The paper would profit from having this discussed.

      Other:

      • All metrics and parameters should be expressed in physical units (e.g., "radii increasing by 3 pixels", axes in Figure 2, 3, 5, S2) so that readers can directly interpret them.
      • The paper would profit from the insights provided by Bird & Cuntz (https://pubmed.ncbi.nlm.nih.gov/31167149/)
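
      As a small illustration of the point about physical units, a pixel-based step size can be reported in microns once the image calibration is known. The pixel size below is a hypothetical placeholder, not a value taken from the manuscript.

```python
def pixels_to_microns(length_px, microns_per_pixel):
    """Convert a length in pixels to microns using the image calibration."""
    if microns_per_pixel <= 0:
        raise ValueError("microns_per_pixel must be positive")
    return length_px * microns_per_pixel


# Hypothetical calibration; the real value comes from the image metadata.
MICRONS_PER_PIXEL = 0.25

step_um = pixels_to_microns(3, MICRONS_PER_PIXEL)
print(f"Sholl radius step: 3 px = {step_um} µm")
```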

      Minor comments:

      • Usage of RGB images (8-bit per channel) seems hardly justifiable. Aren't you losing dynamic range of the GFAP signal?
      • Please explain how MaxAbsScaler "prevents sub-optimal results"
      • The fact that automated batch processing can stall on a single bad 'contrast ratio' image seems rather cumbersome to deal with
      • Please add a license to https://github.com/parulsethi/SMorph/. Without it, other projects may shy away from using SMorph
      • "mounted on stereotax" should be "mounted on a stereotaxis device"?
      • Ensure Schoenen is capitalized

      Significance:

      I find the Desipramine results interesting. However, given the existing claims that DMI can modulate LTP, I regret that the authors did not look at structural modifications in hippocampal neurons (e.g., by performing the experiments in Thy1-M-eGFP animals). I understand that doing so at this point would be a large undertaking.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Overall, we were pleased that the reviewers found our study carefully designed and interesting. We have addressed their comments below.

      Reviewer #1 (Evidence, reproducibility and clarity)

      The manuscript by Kern, et al., demonstrates that phagocytosis in macrophages is regulated in part by the intermolecular distance of phagocytosis-promoting receptors engaging phagocytic targets. Cells expressing chimeric receptors containing cytosolic domains of Fc receptors (FcR) and defined ligand-binding DNA domains were used to drive phagocytosis of opsonized glass beads coated with complementary DNA ligands of defined spacing and number. These so-called origami ligands allowed manipulation of receptor spacing following engagement, which allowed the demonstration that tight spacing of ligands (7 nm or 3.5 nm) optimized signaling for phagocytosis. The study is carefully performed and convincing. I have a few technical concerns and minor suggestions.

      1. It is assumed that the origami preparations were entirely uniform. How much variation was there? Is that supported by TIRF microscopy of origami preparations? Was the TIRF microscopy calibrated for uniformity of fluorescence (i.e., shade correction)?

      Our laboratory (Dong et al.) has extensively characterized the origami uniformity and robustness of these exact pegboards. This paper was just posted on bioRxiv (Dong et al., 2021). We have also cited this paper in our revised manuscript in reference to the characterization of the DNA origami (Line 117).

      We did not use any shade correction. Instead, we only collected data from a central ROI in our TIRF field. To check for uniformity of illumination, we plotted the origami pegboard fluorescence intensity along the x and y axes. We observed a very modest drop-off in signal: the average signal intensity of origamis within 100 pixels of the edge is 76 ± 6% of the intensity of origamis in a 100 pixel square in the center of the ROI. Fitting these data with a Gaussian model resulted in very poor R values. While this may account for some of the variation in signal intensity at individual points, we expect the normalized averages of each condition to be unaffected. We have amended the methods to describe this strategy (Lines 851-854).
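
      The edge-versus-center comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes a square ROI, spot coordinates in pixels, and hypothetical function and parameter names.

```python
import numpy as np


def edge_to_center_ratio(x, y, intensity, roi_size, margin=100, center_box=100):
    """Mean spot intensity near the ROI edge divided by the mean in the center.

    x, y       : spot coordinates in pixels within a square ROI
    intensity  : per-spot fluorescence intensity
    roi_size   : side length of the square ROI in pixels (assumed square)
    margin     : spots closer than this to any edge count as "edge" spots
    center_box : side length of the central square used as reference
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    near_edge = (
        (x < margin) | (x >= roi_size - margin)
        | (y < margin) | (y >= roi_size - margin)
    )
    center = roi_size / 2.0
    half = center_box / 2.0
    in_center = (np.abs(x - center) <= half) & (np.abs(y - center) <= half)

    if not near_edge.any() or not in_center.any():
        raise ValueError("need at least one spot in each region")
    return intensity[near_edge].mean() / intensity[in_center].mean()


# Toy example with made-up spots in a 500 x 500 pixel ROI.
ratio = edge_to_center_ratio(
    x=[10, 250, 490, 250], y=[250, 250, 250, 250],
    intensity=[76, 100, 76, 100], roi_size=500,
)
print(f"edge/center intensity ratio: {ratio:.2f}")
```

      A ratio near 1 indicates flat illumination across the ROI; values well below 1 would argue for applying a shade correction before comparing spot intensities.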

      [[images cannot be shown]]

      2. Likewise, how much variation was there in the expression of the chimeric receptors? Large variation in receptor numbers per cell could significantly alter the quantitative studies. Aside from the flow sorting for cells expressing two different molecules, how were cells selected for analysis?

      We thank the reviewer for bringing up this point. We confirmed comparable receptor expression levels at the cell cortex of the DNA CAR-𝛾 and the DNA CAR-adhesion used throughout the paper. We also have confirmed that receptor levels at the cell cortex were similar for the large DNA CAR constructs used in Figure 6C-D. This data is now included in Figures S5 and S7. We have also altered the text to include this (lines 169-172):

      Expression of the various DNA CARs at the cell cortex was comparable, and engulfment of beads functionalized with both the 4T and the 4S origami platforms was dependent on the Fc𝛾R signaling domain (Figure S5).

      When quantifying bead engulfment, cells were selected for analysis based on a threshold of GFP fluorescence, which was held constant throughout analysis for each individual experiment. We have amended the “Quantification of engulfment” methods section to convey this (lines 921-923).

      3. The scale of the origami relative to the cells is difficult to discern in Figures 2C and D. Additional text would be helpful to indicate, for example, that the spots on the Fig. 2D inset indicate entire origami rather than ligand spots on individual origami particles.

      Thank you for pointing this out. We see how the legend was unclear and have corrected it (lines 453-454), specifically noting “Each diffraction limited magenta spot represents an origami pegboard.” We have also outlined the cell boundary in yellow to make the cell size clearer.

      4. Figure 5 legend, line 482: How was macrophage membrane visualized for these measurements?

      We have added the following clarification (lines 535-536): “The macrophage membrane was visualized using the DNA CAR𝛾, which was present throughout the cell cortex.”

      5. line 265: "our data suggest that there may be a local density-dependent trigger for receptor phosphorylation and downstream signaling". This threshold-dependent trigger response was also indicated in the study of Zhang, et al. 2010. PNAS.

      The Zhang et al. study was influential in our study design, and we wish to give the appropriate credit. Zhang et al. found that a sufficient amount of IgG is necessary to activate late (but not early) steps in the phagocytic signaling pathway. In contrast, our study addresses IgG concentration in small nanoclusters. We find that this nanoscale density affects receptor phosphorylation. Thus, we think these two studies are distinct and complementary.

      Lines 283-287 now read:

      While this model has largely fallen out of favor, more recent studies have found that a critical IgG threshold is needed to activate the final stages of phagocytosis (Zhang et al., 2010). Our data suggest that there may also be a nanoscale density-dependent trigger for receptor phosphorylation and downstream signaling.

      6. line 55: Rephrase, “we found that a minimum threshold of 8 ligands per cluster maximized FcgR-driven engulfment.” It is difficult to picture how a minimum threshold maximizes something.

      We now state “we found that 8 or more ligands per cluster maximized FcgR-driven engulfment.”

      7. line 184: Rephrase, "we created... pegboards with very high-affinity DNA ligands that are predicted not to dissociate on a time scale of >7 hr". Remove "not".

      Thank you for pointing this out; it is now corrected.

      Reviewer #1 (Significance):

      This study provides a significant advance in understanding about the molecular mechanisms of signaling for particle ingestion by phagocytosis.


      Reviewer #2 (Evidence, reproducibility and clarity):

      The manuscript on "Tight nanoscale clustering of Fcg-receptors using DNA origami promotes phagocytosis" studies how clustering and nanoscale spacing of ligand molecules for chimeric Fcg-receptors influence the phagocytosis of functionalized silicon beads by macrophage cell lines. The basis of this study is the design of a chimeric Fc-receptor (DNA-CARg) comprising an extracellular SNAP-tag domain that can be loaded with single-stranded (ss) DNA, the transmembrane part of CD86 and the cytosolic part of the Fc-receptor g-chain containing an immunoreceptor tyrosine-based activation motif (ITAM) as well as a C-terminal green fluorescent protein (GFP). As a control, the authors used a similarly designed DNA-CAR that lacks the intracellular ITAM-containing FCg tail. The chosen targets for this chimeric DNA-CAR are silicon beads covered by a lipid bilayer that contains biotin-labelled lipids that, via Neutravidin, can be loaded with a biotinylated DNA origami pegboard displaying complementary ss-DNA as ligand for the DNA-CAR. The DNA origami pegboard contains four ATTO647N fluorophores for visualization and the ssDNA ligand in different quantities and spacing.

      Using these principles, the authors study how ligand affinity, concentration and spacing influence the activation of the DNA-CARg and the engulfment of the loaded beads.

      The authors show that bead engulfment increases from 2 to 8 ssDNA ligands on the pegboard. Beyond this, ligand number no longer plays a role in engulfment. They then study the role of ligand spacing using pegboards that contain 4 single-strand DNA ligands either in close proximity (7 nm/3.5 nm) or in a more spaced arrangement (21/17.5 nm or 35/38.5 nm). The authors find that bead engulfment is maximally and positively affected by the close spacing of the ssDNA ligands. In their final experiments the authors vary the design of the DNA-CARs by tetramerization of the ITAM-containing Fcg-signaling subunit. In their discussion the authors mention different possibilities for the effect of spacing on the engulfment process.

      I think that, in general, this is an interesting study. However, it has some caveats and open issues that should be clarified before its publication.

      Major comments

      1. As a general comment, it is somewhat a pity that the authors did not use the endogenous FcR as a control. It would have been quite easy for the authors to place the SNAP-tag domain on the Fcg extracellular domain which would allow to do all their experiments in parallel, not only with the DNA-CAR, but also with a DNA-containing wild type receptor. Such a control would be important because, by using a CD86 transmembrane domain, the authors do not know whether the nanoscale localization of their chimeric receptors is reflecting that of the endogenous Fcg receptor.

      We agree with the reviewer completely. We have repeated experiments shown in Figure 4A with a DNA-CAR containing the Fc𝛾 transmembrane domain instead of CD86 as the reviewer suggests. We also included a DNA-CAR version of the Fc𝛾R1 alpha chain, although this construct was not expressed as well as the others. These data are now included in Figure S5, and referenced in lines 167-168.

      2. An important issue that is discussed by the authors but not addressed in this manuscript is whether the different amount and spacing of the ligand is only impacting on signaling or also on the mechanical stress of the cells. Indeed, mechanical stress on the cytoskeleton arrangement could influence the engulfment process. For this, it would be very important to test that the different bead engulfment, for example, those shown in Fig. 4, is strictly dependent on signaling kinases. The authors should repeat the experiment of Fig. 4 a and b in the presence or absence of kinase inhibitors such as the Syk inhibitor R406 or the Src inhibitor PP2 to show whether the different phase of engulfment is dependent on the signaling function of these kinases. This crucial experiment is clearly missing from their study.

      We agree this is an interesting point. We find that ligand spacing affects receptor phosphorylation; however, this does not preclude effects on downstream aspects of the signaling pathway. We will clarify this by adding the following comment to the manuscript (lines 299-301):

      While our data pinpoints a role for ligand spacing in regulating receptor phosphorylation, it is possible that later steps in the phagocytic signaling pathway are also directly affected by ligand spacing.

      The DNA-CAR-adhesion in Figure 1 strongly suggests that intracellular signaling is essential for phagocytosis. We have now included additional controls using this construct as detailed in our response to point 3 below. Unfortunately, Src and Syk inhibitors or knockout abrogate Fc𝛾R mediated phagocytosis (for example, PMIDs 11698501, 9632805, 12176909, 15136586) and thus would eliminate phagocytosis in both the 4T and 4S conditions. This precludes analysis of downstream steps in the phagocytic signaling pathway.

      3. Another problem of this study is that the authors show in Fig. 1A the control DNA-CAR-adhesion but then hardly use it in their study. For example, the crucial experiments shown in Fig. 4 should be conducted in parallel with DNA-CAR-adhesion expressing macrophage cells. This study could provide another indication whether or not ITAM signaling is important for the engulfment process.

      We have added this control. It is now included in Figure S5 and S7. Figure 3D also shows that the DNA-CAR-adhesion combined with the 4T origami pegboards does not activate phagocytosis and we have amended the text to make this more clear (line 152).

      4. Another important aspect is how the concentration of the loaded origami pegboard is influencing the engulfment process. In particular, it would be interesting to show whether the padlocks with different spacings, such as the 4T close spacing versus the 4S large spacing, show a different dependency on the concentration of this padlock loading on the beads. This would be another important experiment to add to their study.

      We agree that this is an interesting question. We suspect that at a very high origami density, 4S signaling would improve, and potentially approach the 4T. However, we are currently coating the beads in saturating levels of origami pegboards. Thus we cannot increase origami pegboard density and address this directly.

      Minor comments:

      1. The definition of the ITAM is Immunoreceptor Tyrosine-based Activation Motif and not "Immune Tyrosine Activation Motif" as stated by the authors.

      We have corrected this.

      2. The authors discuss that the segregation of the inhibitory phosphatase CD45 from the clustered Fc receptors is the major mechanism explaining their finding that 4T close spacing is more effective than 4S large spacing. With the advent of CRISPR/Cas9 technology it is trivial to delete the CD45 gene in the genome of the RAW264.7 macrophage cell line used in this study, and I am puzzled why the authors are not conducting such a simple but, for their study, very important experiment (it takes only 1-2 months to get the results).

      This experiment may be informative but we have two concerns about its feasibility. First, CD45 is a phosphatase with many different roles in macrophage biology, including activating Src family kinases by dephosphorylating inhibitory phosphorylation sites (PMID 8175795, 18249142, 12414720). Second, CD45 is not the only bulky phosphatase segregated from receptor nanoclusters. For example, CD148 is also excluded from the phagocytic synapse (PMID 21525931). CD45 and CD148 double knockout macrophages show hyperphosphorylation of the inhibitory tyrosine on Src family kinases, severe inhibition of phagocytosis, and an overall decrease in tyrosine phosphorylation (PMID 18249142). CD45 knockout alone showed mild phenotypes in macrophages. We anticipate that knocking out CD45 alone would have little effect, and knocking out both of these phosphatases would preclude analysis of phagocytosis. Because of our feasibility concerns and the lengthy timeline for this experiment, we believe this is outside of the scope of our study.

      In our discussion, we simplistically described our possible models in terms of CD45 exclusion, as the mechanisms of CD45 exclusion have been well characterized. This was an error and we have amended our discussion to read (lines 335-343):

      As an alternative model, a denser cluster of ligated receptors may enhance the steric exclusion of the bulky transmembrane proteins like the phosphatases CD45 and CD148 (Bakalar et al., 2018; Goodridge et al., 2012; Zhu, Brdicka, Katsumoto, Lin, & Weiss, 2008).

      Reviewer #2 (Significance):

      The innovative part of this study is the combination of a SNAP-tag-attached, chimeric Fc-receptor with the DNA origami pegboard technology to address important open questions on receptor function.

      Referees cross-commenting

      I find most comments of my three reviewing colleagues reasonable. I also agree with Reviewer #1's comment 2:

      Likewise, how much variation was there in the expression of the chimeric receptors?

      Large variation in receptor numbers per cell could significantly alter the quantitative studies. Aside from the flow sorting for cells expressing two different molecules, how were cells selected for analysis?

      But I want to add that it is not only the amount of receptors but also the nanoscale location that is key to receptor function.

      We have ensured that all receptors are trafficked to the cell surface. We have also measured their intensity at the cell cortex as discussed in response to Reviewer 1.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This is a very nicely done synthetic biology/biophysics study on the effect of ligand spacing on phagocytosis. They use a DNA-based recognition system that the group has previously used to investigate T cell signaling, but express the SNAP-tag-linked transmembrane receptor in a macrophage cell line and present the ligands using DNA origami mats to control the number and spacing of complementary ligands that are designed to be in the typical range for low or high affinity FcR, a receptor that can trigger phagocytosis. The study offers some very nice quantitative data sets that will be of immediate interest to groups working in this area and, in the future, for design of synthetic receptors for immunotherapy applications. Other groups are working on similar platforms for TCR. I don't feel there is any need for more experiments, but I have some questions and suggestions. Answering and considering these could clarify the new biological knowledge gained.

      We thank the reviewer for their support of our manuscript. Given the reviewer’s statement that no new experiments are required, we have answered their questions to the best of our ability given the current data. Should the editor decide that any of these topics require experimental data to enhance the significance of the paper, we are happy to discuss new experiments.

      Reviewer #3 (Significance):

      I think the significance would be increased by addressing these questions, which would help readers understand how the synthetic system described relates to other systems directed at similar questions and to more natural settings.

      1. The densities of the freely mobile DNA ligands required to trigger phagocytosis are quite high. Was the length of the DNA duplexes optimized? The entire complex for both the intermediate and high affinity duplexes seems quite short, perhaps <10 nm. Might the stimulation be more efficient if a short stretch of DS DNA is added to increase the length to 12-13 nm?

      The extracellular domain of the DNA-CAR (SNAP tag and ssDNA strand) is approximately 10 nm (PMID 28340336). The biotinylated ligand ssDNA is attached to the bilayer via neutravidin, resulting in a predicted 14 nm intermembrane spacing. The endogenous IgG FcR complex is 11.5 nm. Bakalar et al. (PMID 29958103) tested the effect of antigen height on phagocytosis and found that the shortest intermembrane distance tested (approximately 15 nm) was the most effective. As the reviewer notes, the optimal distance between macrophage and target may be larger than our DNA-CAR. However, we think the intermembrane spacing in our system is within the biologically relevant range.

      We saw robust phagocytosis at 300 molecules/micron of ssDNA, which is similar to the IgG density used on supported lipid bilayer-coated beads in other phagocytosis studies (PMID 29958103, 32768386). As the reviewer noticed, this is significantly higher than ligand density necessary to activate T cells (PMID 28340336). We have added a comment on ligand density to lines 96-97.

      2. Are the origami mats generally laterally mobile on the bilayers? If so, what is the diffusion coefficient? Can one detect the mats accumulating in the initial interface between the bead and cell, particularly in cases where there is no phagocytosis? Would immobility of the mats make them more efficient at mediating phagocytosis compared to the monodispersed ligands, which I assume are highly mobile and might even be "slippery"?

      We have confirmed that our bead protocol generally produces mobile bilayers, where his-tagged proteins can freely diffuse to the cell-bead interface (see accumulation of a his-tagged FRB binding to a transmembrane FKBP receptor at the cell-bead synapse below). We can qualitatively say that the origamis appear mobile on a planar lipid bilayer (see Dong et al., 2021 and images below). Directly measuring the diffusion coefficient on the beads is extremely difficult because the beads themselves are mobile (both diffusing and rotating), and cannot be imaged via TIRF. We do not see much accumulation of the origami at cell-bead synapses. This could reflect lower mobility of the origamis, or could be because the relative enrichment of origamis is difficult to detect over the signal from unligated origamis.

      Overall, we expect the origami pegboards (tethered by 12 neutravidins) are less mobile than single strand DNA (tethered by a single neutravidin, supported by qualitative images below). We are uncertain whether this promotes phagocytosis. At least one study suggests that increased IgG mobility promotes phagocytosis (PMID 25771017). However, the zipper model would suggest that tethered ligands may provide a better foothold for the macrophage as it zippers the phagosome closed (PMID 14732161). Hypothetically, ligand mobility could affect signaling in two ways - first by promoting nanocluster formation, and second by serving as a stable platform for signaling as the phagosome closes. Since our system has pre-formed nanoclusters, the effect of ligand mobility may be quite different than in the endogenous setting.

      [[image cannot be shown]]

      In the above images, a 10xHis-FRB labeled with AlexaFluor647 was conjugated to Ni-chelating lipids in the bead supported lipid bilayer. The macrophages express a synthetic receptor containing an extracellular FKBP and an intracellular GFP. Upon addition of rapamycin, FRB and FKBP form a high affinity dimer, and FRB accumulates at the bead-macrophage contact sites.

      [[image cannot be shown]]

      In the above images, single molecules were imaged for 3 sec. The tracks of each molecule are depicted by lines, colored to distinguish between individual molecules. The scale bar represents 5 microns in both panels.

      3. Breaking down the analysis into initiation and completion is interesting. When using the non-signalling adhesion constructs, would they get to the initiation stage or would that attachment be less extensive than the initiation phase?

      This is an interesting question. While we did not include the DNA-CAR-adhesion in our kinetic experiments, we have now quantified the frequency of cups that would match our ‘initiation’ criteria in 3 representative data sets where macrophages were fixed after 45 minutes of interaction with origami pegboard-coated beads. We found that an average of 16/125 of 4T beads touching DNA-CAR-adhesion macrophages met the ‘initiation’ criteria and an average of 2/125 were eaten (14% total). In comparison, we examined 4T beads touching DNA CAR𝛾 macrophages and found that on average 23/125 met the ‘initiation’ criteria, and 45/125 were already engulfed (54%). This suggests that the DNA-CAR-adhesion alone may induce enough interaction to meet our initiation criteria, but without active signaling from the FcR this extensive interaction is rare. We have added this data in a new Figure S6 and commented on this in lines 213-215.
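
      The percentages quoted above follow directly from the stated counts, as this short check shows (the function name is ours; the counts are those given in the response):

```python
def percent_interacting(initiated, engulfed, total):
    """Percentage of bead contacts that are either initiated or engulfed."""
    return 100.0 * (initiated + engulfed) / total


# Counts per 125 bead-cell contacts, as reported in the response above.
adhesion = percent_interacting(initiated=16, engulfed=2, total=125)
signaling = percent_interacting(initiated=23, engulfed=45, total=125)

print(f"DNA-CAR-adhesion: {adhesion:.1f}%")  # reported as 14%
print(f"DNA CAR-gamma:    {signaling:.1f}%")  # reported as 54%
```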

      4. It would be interesting to put these results in perspective of earlier work on spacing with planar nanoarrays, although these can't be applied to beads. For integrin-mediated adhesion there was a very distinct threshold for RGD ligand spacing that could be related to the size of some integrin-cytoskeletal linkers (PMID: 15067875). On the other hand, T cell activation seemed more continuous with changes in spacing over a wide range with no discrete threshold (PMID: 24117051, 24125583) unless the spacing was increased to allow access to CD45, in which case a more discrete threshold was generated (PMID: 29713075). The results here for phagocytosis with the very small ligands that would likely exclude CD45 seem to be more of a continuum without a discrete threshold, although high densities of ligand are needed. This issue of continuous sensing vs sharp threshold is biologically interesting, so it would be good to assess this by standards that are as consistent as possible across systems.

      We agree that this is an interesting body of literature worth adding to our discussion. We have added a paragraph that puts our study in the context of prior work on related systems, including these nanolithography studies (lines 364-382):

      How do the spacing requirements for Fc𝛾R nanoclusters compare to other signaling systems? Engineered multivalent Fc oligomers revealed that IgE ligand geometry alters Fcε receptor signaling in mast cells (Sil, Lee, Luo, Holowka, & Baird, 2007). DNA origami nanoparticles and planar nanolithography arrays have previously examined optimal inter-ligand distance for the T cell receptor, B cell receptor, NK cell receptor CD16, death receptor Fas, and integrins (Arnold et al., 2004; Berger et al., 2020; Cai et al., 2018; Deeg et al., 2013; Delcassian et al., 2013; Dong et al., 2021; Veneziano et al., 2020). Some systems, like integrin-mediated cell adhesion, appear to have very discrete threshold requirements for ligand spacing while others, like T cell activation, appear to continuously improve with reduced intermolecular spacing (Arnold et al., 2004; Cai et al., 2018). Our system may be more similar to the continuous improvement observed in T cell activation, as our most spaced ligands (36.5 nm) are capable of activating some phagocytosis, albeit not as potently as the 4T. Interestingly, as the intermembrane distance between T cell and target increases, the requirement for tight ligand spacing becomes more stringent (Cai et al., 2018). This suggests that IgG bound to tall antigens may be more dependent on tight nanocluster spacing than short antigens. Planar arrays have also been used to vary inter-cluster spacing, in addition to inter-ligand spacing (Cai et al., 2018; Freeman et al., 2016). Examining the optimal inter-cluster spacing during phagosome closure may be an interesting direction for future studies.


      Additional experiments performed in revision

      In addition to these reviewer comments, we have added additional controls validating the DNA-CAR-4x𝛾 used in Figure 6c,d. We compared the DNA-CAR-4x𝛾 to versions of the DNA-CAR-1x𝛾-3x𝛥ITAM construct with the functional ITAM in the second and fourth positions (see the schematics now included in Figure S7). We found that four individual receptors with a single ITAM each were able to induce phagocytosis regardless of which position the ITAM was in. However, the DNA-CAR-4x𝛾 construct, which also contains 4 ITAMs, was not. This further validates the experiment presented in 6c,d. We also fixed minor errors we discovered in the presentation of data for Figures 1C and S1A.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reviewer #3

      Evidence, reproducibility and clarity

      This is a very nicely done synthetic biology/biophysics study on the effect of ligand spacing on phagocytosis. They use a DNA-based recognition system that the group has previously used to investigate T cell signaling, but express the SNAP-tag-linked transmembrane receptor in a macrophage cell line and present the ligands using DNA origami mats to control the number and spacing of complementary ligands that are designed to be in the typical range for low or high affinity FcR, a receptor that can trigger phagocytosis. The study offers some very nice quantitative data sets that will be of immediate interest to groups working in this area and, in the future, for design of synthetic receptors for immunotherapy applications. Other groups are working on similar platforms for TCR. I don't feel there is any need for more experiments, but I have some questions and suggestions. Answering and considering these could clarify the new biological knowledge gained.

      Significance:

      I think the significance would be increased by addressing these questions, which would help readers understand how the synthetic system described relates to other systems directed at similar questions and to more natural settings.

      1. The densities of the freely mobile DNA ligands required to trigger phagocytosis are quite high. Was the length of the DNA duplexes optimized? The entire complex for both the intermediate and high affinity duplexes seems quite short, perhaps <10 nm. Might the stimulation be more efficient if a short stretch of DS DNA is added to increase the length to 12-13 nm?
      2. Are the origami mats generally laterally mobile on the bilayers? If so, what is the diffusion coefficient? Can one detect the mats accumulating in the initial interface between the bead and cell, particularly in cases where there is no phagocytosis? Would immobility of the mats make them more efficient at mediating phagocytosis compared to the monodispersed ligands, which I assume are highly mobile and might even be "slippery"?
      3. Breaking down the analysis into initiation and completion is interesting. When using the non-signalling adhesion constructs, would they get to the initiation stage or would that attachment be less extensive than the initiation phase?
      4. It would be interesting to put these results in perspective of earlier work on spacing with planar nanoarrays, although these can't be applied to beads. For integrin-mediated adhesion there was a very distinct threshold for RGD ligand spacing that could be related to the size of some integrin-cytoskeletal linkers (PMID: 15067875). On the other hand, T cell activation seemed more continuous with changes in spacing over a wide range with no discrete threshold (PMID: 24117051, 24125583) unless the spacing was increased to allow access to CD45, in which case a more discrete threshold was generated (PMID: 29713075). The results here for phagocytosis with the very small ligands that would likely exclude CD45 seem to be more of a continuum without a discrete threshold, although high densities of ligand are needed. This issue of continuous sensing vs sharp threshold is biologically interesting, so it would be good to assess this by standards that are as consistent as possible across systems.
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript on "Tight nanoscale clustering of Fcg-receptors using DNA origami promotes phagocytosis" studies how clustering and nanoscale spacing of ligand molecules for chimeric Fcg-receptors influence the phagocytosis of functionalized silicon beads by macrophage cell lines. The basis of this study is the design of a chimeric Fc-receptor (DNA-CARg) comprising an extracellular SNAP-tag domain that can be loaded with single-stranded (ss) DNA, the transmembrane part of CD86 and the cytosolic part of the Fc-receptor g-chain containing an immunoreceptor tyrosine-based activation motif (ITAM) as well as a C-terminal green fluorescent protein (GFP). As a control, the authors used a similarly designed DNA-CAR that lacks the intracellular ITAM-containing FCg tail. The chosen targets for this chimeric DNA-CAR are silicon beads covered by a lipid bilayer that contains biotin-labelled lipids that, via Neutravidin, can be loaded with a biotinylated DNA origami pegboard displaying complementary ss-DNA as ligand for the DNA-CAR. The DNA origami pegboard contains four ATTO647N fluorophores for visualization and the ssDNA ligand in different quantities and spacing. Using these principles, the authors study how ligand affinity, concentration and spacing influence the activation of the DNA-CARg and the engulfment of the loaded beads.

      The authors show that bead engulfment is increased between 2 till 8 ssDNA ligands on the pegboard. After this, ligand numbers do not play a role anymore in the engulfment. They then study the role of the ligand spacing using pegboards that either contain 4 single strand DNA ligands in close (7nm/3,5nm) proximity or a more spaced version using 21/17,5 nm or 35/38,5 nm. The authors find that the bead engulfment is maximally and positively affected by the close spacing of the ssDNA ligands. In their final experiments the authors vary the design of the DNA-CARs by tetramerization of the ITAM-containing Fcg-signaling subunit. In their discussion the authors mention different possibilities for the effect of spacing on the engulfment process.

      I think that, in general, this is an interesting study. However, it has some caveats and open issues that should be clarified before its publication.

      Major comments

      1. As a general comment, it is somewhat a pity that the authors did not use the endogenous FcR as a control. It would have been quite easy for the authors to place the SNAP-tag domain on the Fcg extracellular domain, which would allow them to run all their experiments in parallel, not only with the DNA-CAR but also with a DNA-containing wild-type receptor. Such a control would be important because, by using a CD86 transmembrane domain, the authors do not know whether the nanoscale localization of their chimeric receptors reflects that of the endogenous Fcg receptor.
      2. An important issue that is discussed by the authors but not addressed in this manuscript is whether differences in the amount and spacing of the ligand affect only signaling or also the mechanical stress on the cells. Indeed, mechanical stress on the cytoskeleton arrangement could influence the engulfment process. For this, it would be very important to test whether the differences in bead engulfment, for example those shown in Fig. 4, are strictly dependent on signaling kinases. The authors should repeat the experiment of Fig. 4 a and b in the presence or absence of kinase inhibitors such as the Syk inhibitor R406 or the Src inhibitor PP2 to show whether the different phases of engulfment depend on the signaling function of these kinases. This crucial experiment is clearly missing from their study.
      3. Another problem of this study is that the authors show in Fig. 1A the control DNA-CAR-adhesion but then hardly use it in their study. For example, the crucial experiments shown in Fig. 4 should be conducted in parallel with DNA-CAR-adhesion expressing macrophage cells. This study could provide another indication whether or not ITAM signaling is important for the engulfment process.
      4. Another important aspect is how the concentration of the loaded origami pegboard influences the engulfment process. In particular, it would be interesting to show whether pegboards with different spacings, such as the 4T close spacing versus the 4s large spacing, show a different dependency on the concentration of pegboard loaded on the beads. This would be another important experiment to add to their study.

      Minor comments:

      1. The definition of the ITAM is Immunoreceptor Tyrosine-based Activation Motif and not "Immune Tyrosine Activation Motif" as stated by the authors.
      2. The authors propose that segregation of the inhibitory phosphatase CD45 from the clustered Fc receptors is the major mechanism explaining their finding that 4T close spacing is more effective than 4s large spacing. With the advent of CRISPR/Cas9 technology it is trivial to delete the CD45 gene in the genome of the RAW264.7 macrophage cell line used in this study, and I am puzzled why the authors are not conducting such a simple but, for their study, very important experiment (it takes only 1-2 months to get the results).

      Referees cross-commenting

      I find most of the comments of my three reviewing colleagues reasonable.

      I also agree with Reviewer #1's comment 2:

      Likewise, how much variation was there in the expression of the chimeric receptors? Large variation in receptor numbers per cell could significantly alter the quantitative studies. Aside from the flow sorting for cells expressing two different molecules, how were cells selected for analysis?

      But I want to add that it is not only the amount of receptors but also their nanoscale location that is key to receptor function.

      Significance:

      The innovative part of this study is the combination of a SNAP-tag-attached, chimeric Fc-receptor with the DNA origami pegboard technology to address important open questions on receptor function.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Kern, et al., demonstrates that phagocytosis in macrophages is regulated in part by the intermolecular distance of phagocytosis-promoting receptors engaging phagocytic targets. Cells expressing chimeric receptors containing cytosolic domains of Fc receptors (FcR) and defined ligand-binding DNA domains were used to drive phagocytosis of opsonized glass beads coated with complementary DNA ligands of defined spacing and number. These so-called origami ligands allowed manipulation of receptor spacing following engagement, which allowed the demonstration that tight spacing of ligands (7 nm or 3.5 nm) optimized signaling for phagocytosis. The study is carefully performed and convincing. I have a few technical concerns and minor suggestions.

      1. It is assumed that the origami preparations were entirely uniform. How much variation was there? Is that supported by TIRF microscopy of origami preparations? Was the TIRF microscopy calibrated for uniformity of fluorescence (i.e., shade correction)?
      2. Likewise, how much variation was there in the expression of the chimeric receptors? Large variation in receptor numbers per cell could significantly alter the quantitative studies. Aside from the flow sorting for cells expressing two different molecules, how were cells selected for analysis?
      3. The scale of the origami relative to the cells is difficult to discern in Figures 2C and D. Additional text would be helpful to indicate, for example, that the spots on the Fig. 2D inset indicate entire origami rather than ligand spots on individual origami particles.
      4. Figure 5 legend, line 482: How was macrophage membrane visualized for these measurements?
      5. line 265: "our data suggest that there may be a local density-dependent trigger for receptor phosphorylation and downstream signaling". This threshold-dependent trigger response was also indicated in the study of Zhang, et al. 2010. PNAS.
      6. line 56: Rephrase, "we found that a minimum threshold of 8 ligands per cluster maximized FcgR-driven engulfment." It is difficult to picture how a minimum threshold maximizes something.
      7. line 171: Rephrase, "we created... pegboards with very high-affinity DNA ligands that are predicted not to dissociate on a time scale of >7 hr". Remove "not".

      Significance

      This study provides a significant advance in understanding about the molecular mechanisms of signaling for particle ingestion by phagocytosis.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to all reviewers

      We thank all the reviewers for carefully considering our manuscript and providing useful comments and suggestions. We agree with the general comment that testing our key findings in breast cancer cells is important. We will therefore carry out this work over the coming months and include these data in the revision. The other specific comments we address individually in the point-by-point responses below, which provide an outline of the other new experiments we plan to carry out prior to revision.

      In addition to this, we would like to just highlight one general point that we only picked up when considering these responses. It is important to highlight this to all reviewers now, since we believe it adds clinical weight to our conclusions. This relates to the issue of P53, which our manuscript shows drives resistance to CDK4/6 inhibition in cells by inhibiting long-term cell cycle withdrawal following genotoxic damage.

      P53 loss has been implicated in abemaciclib resistance in breast cancer patients (P53 mutation was detected in 2/18 responsive patients and 10/13 non-responsive patients (Patnaik et al., 2016)). This was recently corroborated in a larger scale study in breast cancer: the first whole exome sequencing study aimed at characterising intrinsic and acquired resistance to CDK4/6 inhibitors (Wander et al., 2020). In this recent study, P53 loss/mutation was identified in 0/18 sensitive tumours, 14/28 intrinsically resistant tumours, and 9/13 tumours with acquired resistance. This was the most frequent single genetic change associated with resistance (58.5%), although 8 other genetic changes were also associated with resistance to differing degrees (7-27%).

      Most of these other resistance events occurred in pathways known previously to help drive G1/S progression following CDK4/6 inhibition: i.e. fully predictable resistance mechanisms (RB loss, CCNE2 amplification, ER loss, RAS/AKT1 activation, FGFR2/ERBB2 mutation/amplification). Importantly, when the authors attempted to recapitulate these resistance events in breast cancer cell lines, they could demonstrate the expected increase in proliferation following CDK4/6 inhibition in all situations tested, except for P53 loss. This caused the authors to conclude that “loss of P53 function is not sufficient to drive CDK4/6i resistance”. This would appear to us to be an unsatisfactory explanation given the clinical data. However, the authors speculated further that: “Enrichment of TP53 mutation in resistant specimens may result from heavier pre-treatment (including chemotherapies), may be permissive for the development of other resistance-promoting alterations, or may cooperate with secondary alterations to drive CDK4/6i resistance in vivo.”

      We believe that our data provide a crucial alternative explanation for these clinical findings. P53 does not affect the efficiency of a G1 arrest (fig.2), but rather it prevents the resulting genotoxic damage from inducing long-term cell cycle withdrawal (figs.2,3). Therefore, this could explain why it drives resistance in clinical disease but not in the in vitro cell growth assays employed by Wander et al. This highlights a crucial general point of our paper – important effects like this can be missed or misinterpreted until the true nature of long-term cell cycle withdrawal is appreciated.

      As part of our breast cancer work at revision we will analyse this closely by comparing the effect of p53 loss on long-term cell cycle withdrawal. If the current RPE1 data hold true in breast cancer, then we believe that our study would provide a crucial explanation for these clinical findings, and in turn, these clinical data would throw weight behind our conclusion that genotoxic damage and p53 loss is a clinically important consequence of CDK4/6 inhibition in patients.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Comments on 'CDK4/6 inhibitors induce replication stress to cause long-term cell cycle withdrawal' The rationale for this work is to understand the mechanism by which Cdk4/6 inhibitors inhibit tumour cell growth, specifically via senescence which seems to be a frequent outcome of Cdk4/6 inhibition. Although several mechanisms by which Cdk4/6 inhibition induces senescence have been proposed these have varied with the cancer cell model studied. To examine the mechanism for the cytostatic effect of cdk4/6i in therapy without potential confounding effects of different cancer cell line backgrounds, Crozier et al tackle this question in the non-transformed, immortalised diploid human cell line, RPE1. They use live cell imaging and colony formation to track the impact of G1 arrests of different lengths induced by a range of clinically relevant cdk4/6 inhibitors. They also use CRISPR-mediated removal of p53 to examine the role of p53 in the observed cell cycle responses. After noting that G1 arrest of over 2 days leads to a pronounced failure in continued cell cycle and proliferation that is associated with features of replication stress, they perform a proteomics analysis to determine the factors responsible for this. They discover that MCM complex components and some other replicative proteins are downregulated and overall suggest a mechanism whereby downregulation of these essential replication components during a prolonged G1 induces replication stress and ultimate failure of proliferation. They show the impact of cdk4/6 inhibition can be increased by combining with either aneuploidy induction (to indirectly elevate replication stress), aphidicolin (to directly elevate replication stress) or chemotherapy agents that damage DNA. Overall this is a well written and presented manuscript. Data are extremely clearly presented and described clearly within the text. Most appropriate controls were included and the work is performed to a high standard. I have a few comments about the proteomic analysis, and the link between MCM component deregulation and the induction of replication stress:

      - We thank the reviewer for this careful, detailed review, and for their kind comments about our work.

      **Major points:**

      1. Relevance to cancer. I appreciate that examining the mechanism in a diploid line is a sensible place to start. However it remains a bit unclear precisely which aspects of this mechanism might be conserved in cancer. It could be helpful to provide evidence (if it exists) of the impact of cdk4/6 inhibition in tumour cells. For example, are catastrophic mitosis, senescence, etc observed? And is there anything further known about the relationship between tumour mutations such as p53 and clinical response to Cdk4/6i?

      - It is important to point out that senescence is a common outcome of CDK4/6 inhibition in tumour cells, but exactly why tumour cells become senescent is still unclear. There have been many possible explanations proposed (see introduction), but so far, none of these implicate DNA damage. This is surprising for us, considering that DNA damage remains the best-known inducer of senescence and this is how most other broad-spectrum anti-cancer drugs induce permanent cell cycle exit. P53 loss has been associated with CDK4/6i resistance in the clinic, but this has also not previously been linked to genotoxic stress or senescence following CDK4/6 inhibition (see detailed description of this in comment to all reviewers above). Therefore, our data could help to explain both of these key findings. However, we appreciate the importance of testing these results in breast cancer cells, therefore we will perform these experiments and include the data after revision.

      Also - many of the phenotypes followed in this manuscript vary considerably with the length of G1 and the length of release. Which of these scenarios might mimic in vivo conditions?

      - We see that a prolonged arrest (> 2 days) is necessary to see genotoxic effects in RPE cells. Clinically, palbociclib is administered in 3-week on/1-week off cycles, therefore this is consistent with the possibility that replication stress is induced during the off periods to cause genotoxic damage and cell cycle withdrawal.

      Relating to the downregulation of MCM complex members, and the potential impact on origin licensing, how would this mechanism be manifest in cancer cells that have already deregulated gene transcription programs, and are already experiencing replication stress?

      - We hypothesise that cancer cells with ongoing replication stress may be more sensitive to the MCM downregulation caused by CDK4/6 inhibition. The rationale is that a reduction in licenced origins would impair the ability of dormant origins to fire in response to replication problems, therefore making elevated levels of replication stress less tolerable. This is consistent with the enhanced effect of CDK4/6 inhibition seen when replication stress is elevated in RPE cells. Moreover, others have shown that experimentally reducing MCM protein levels induces hypersensitivity to replication stress in transformed cell lines such as U2OS and HeLa (Ge et al., 2007; Ibarra et al., 2008). Thus, low MCM levels and reduced origin licensing can contribute to replication failure in cancer cells.

      1. MCM protein levels and proposed impact on chromatin loading and origin licensing. Several MCM components are clearly reduced at the protein level. A chromatin assay (assaying fluorescence of signal remaining after pre-extraction of cytosolic proteins) suggests that MCM loading on chromatin is reduced, and this is taken to suggest a reduction in origin licensing. This is quite an indirect method - and it is difficult to conclude that the reduced chromatin bound fraction really represents a meaningful reduction in origin licensing. It would be more convincing if positive and negative controls for this assay were included. Moreover it is not clear if this MCM reduction and proposed reduction in licensed origins would actually impact replication in an otherwise unperturbed state? Many more origins are licensed than actually fire during a normal S-phase, so it is not entirely clear that MCM levels could lead directly to replication stress here.

      - Quantifying the non-extractable MCM proteins is in truth the most direct assay for origin licensing (not origin firing) available in human cells. To our knowledge, there are no reports of MCM loading by this or similar assays that are not strongly correlated with origin licensing per se. The reviewer is correct that modest reductions in MCM loading are well-tolerated in the absence of other perturbations. Specifically, Ge et al found no proliferation effects after 50% MCM loading reduction, but any further reduction introduced a proliferation delay (Ge et al., 2007). Of note, the U2OS cells used in that study also have a functional p53 response.

      - Another important point that is worth emphasizing, is that many of the differentially downregulated proteins only function at replication forks (fig.4c). Therefore, we believe that the replication stress is a combined result of poor licencing and reduced levels of replication fork proteins that are needed after the origins fire. We will clarify this point in the revised manuscript.

      1. Loss of MCM protein levels and chromatin loading occurs after 1 day, not 4 days, of Cdk4/6 inhibition. The current proposal (based on evidence from the live cell imaging, and the induction of hallmarks of replication stress in figures 1-3) seems to be that something occurs between 2 and 7 days of cdk4/6i to prevent cells from resuming a normal cell cycle. Thus the proteomics was performed between 2 and 7 days, and MCM proteins identified as major changed proteins between those times. However, according to Western blots and FACS profiles in Figure 4, the major reduction in MCM protein levels, and chromatin loading occurs already at 1 day of cdk4/6i (Figure 4d,e,f). However, replication stress is not observed after this timepoint (Figure 3) - so this seems to decouple the timings of MCM reduction from induction of replication stress. How can this be reconciled?

      - We agree that some of the observed changes to replisome components are quite considerable after just 1 day of arrest (some of these downregulations such as Cdc6 or phospho-Rb can be attributed to the cell cycle arrest itself - Cdc6 is unstable in G1 - but others, such as MCM proteins, are not typically lost during G1). We were initially surprised by this too, considering that the phenotype clearly appears later than 1 day of arrest. It is important to state though, that the levels of almost all replisome components continue to decline as the duration of arrest is extended, eventually falling to considerably lower levels than seen after just 1 day. This is observed for MCM2, MCM3 and PCNA by western (fig.4e,e) and a large number of other replisome components by proteomics (fig.4c, 2 vs 7 days). Even MCM loading, which is 58% reduced after just 1-day arrest, is still reduced even further to just 20% of controls after 7 days (p- Our interpretation of the phenotypic data in light of this, is that replication problems become apparent when the number of licensed origins and the function of the replisome is compromised below a certain threshold; which most likely depends on cell type and, in particular, the levels of endogenous replication stress. So, in RPE cells, 1-day treatment is clearly tolerable, perhaps because there are still enough origins to complete DNA replication successfully. But, importantly, if replication stress is enhanced in these cells then 1-day of palbociclib arrest now starts to cause observable defects. This is evident in Figure 5h, where 1-day palbociclib treatment causes minimal effect on long-term growth on its own, but growth is reduced considerably when replication stress is elevated with genotoxic drugs. We interpret this to mean that the reduction in licenced origins and replisome components observed after 1 day of arrest, starts to become problematic in situations when replication stress is elevated.

      - This is actually an important point that we will highlight at revision, because one prediction is that other cells with elevated replication stress (e.g. tumour cells with oncogene-induced replication stress) may begin to see defects after as little as 1-day palbociclib arrest.

      **Minor points:**

      1. All the live cell tracking figures would be even more informative if a quantification of key features (such as a cumulative frequency of S-phase entry, or a mean+SD of time in G1, S and G2) were also presented.

      - We agree this will be useful, and we will include this information after revision.
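      The quantification requested above could be sketched as follows. This is a minimal illustration only, not the authors' analysis: the function names and the per-cell timings are hypothetical, and cells that never enter S phase during the movie are assumed to be recorded as `None`.

```python
from statistics import mean, stdev

def cumulative_entry_fraction(entry_times_h, timepoints_h):
    """Fraction of tracked cells that have entered S phase by each timepoint.

    Cells that never enter S phase (None) stay in the denominator,
    so the curve plateaus below 1.0 when some cells remain arrested.
    """
    n = len(entry_times_h)
    return [
        sum(1 for t in entry_times_h if t is not None and t <= tp) / n
        for tp in timepoints_h
    ]

def mean_and_sd(durations_h):
    """Mean and sample SD of phase durations, ignoring cells with no value."""
    values = [d for d in durations_h if d is not None]
    return mean(values), stdev(values)

# Hypothetical S-phase entry times (hours after release) for five tracked cells.
entry_times = [4.0, 6.5, 8.0, None, 12.0]
fractions = cumulative_entry_fraction(entry_times, [0, 6, 12, 24])
g1_mean, g1_sd = mean_and_sd(entry_times)
```

      Keeping non-entering cells in the denominator is a deliberate choice here, since it makes the fraction of permanently arrested cells visible in the same plot.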

      1. In Figure 2D the cells released from palbociclib seem to delay longer in G1 until they start to enter S phase, compared to cells co-treated with STLC (Figure 2B). Why would this be? It is difficult to tell if other subtle effects might be present in between the +STLC and -STLC conditions, so additional graphs such as those suggested above might be informative here in particular.

      - Fig.2d shows a representative experiment (50 cells) because it is difficult to interpret these individual cell cycle profiles when more than 50 cells are presented. However, we have all the data from 3 experiments (150 cells), therefore we will also calculate timings as suggested and present this information after revision.

      1. Figure 4f It would be helpful to see the FACS plot for at least one of the conditions quantified in the graph as a comparison.

      - These plots will be included after revision

      1. MCM2 protein is not down in p53 wt, but is reduced in p53 KO cells - why is this? And why is MCM2 not impacted when the other MCM complex members are?

      - We think perhaps there has been a mistake in interpreting these graphs. MCM2 is actually slightly lower in WT than KO cells at 1 day, and similar at 4 and 7 days (Fig.4d,e). MCM2 is also reduced slightly more than MCM3 (fig.4d,e) and MCM2, 3, 4, and 5 are all reduced by similar extents between 2 and 7 days palbociclib arrest (30-40% reductions; Fig.4c).

      Inducing aneuploidy with reversine to elevate replication stress may result in additional aneuploidy-related stresses that confound this interpretation. For example, aneuploidy per se is known to elevate p21 and p53 levels, and chromosome mis-segregation could elevate DNA damage. For these reasons these experiments are not as compelling as the direct elevation of replication stress using aphidicolin.

      - We agree that the aneuploidy experiment could have many different interpretations, and only one of these relates specifically to replication stress. This was also commented on by reviewer 3, so we feel it is best to remove this data and just keep the data on drugs that affect replication stress or DNA damage directly. We will address the effects of aneuploidy more extensively in a separate study.

      **Interesting points to follow up/add more mechanism**

      1. What is mechanism of protein downregulation of MCM etc? Was gene transcription impacted, or is this a question of protein stability? Depletion of one subunit can destabilise the complex leading to protein loss of the other MCM subunits, so perhaps this effect could be due to downregulation of a single MCM complex member.
      2. Are these findings specific to Cdk4/6 inhibitors, or would another means of arresting cells in G1 have the same impact?

      Both of these points are interesting questions and they are actually the focus of an entirely separate study that is ongoing. In particular, we are working on the mechanism(s) of MCM and replisome downregulation.

      Reviewer #1 (Significance (Required)): The central question of the paper is an important one so this work would be of interest to many in the clinical and preclinical fields, and also to the cell cycle and replication stress fields.

      - We thank the reviewer for this, and we agree that linking CDK4/6 inhibitors to genotoxic stress is important both for our understanding of cell cycle control and for cancer treatment. We are actually amazed that these drugs have not previously been linked to genotoxic stress, given that they appear to have broad pan-cancer activity and all other broad-spectrum anti-cancer drugs work by causing genotoxic stress.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In this paper, Saurin and colleagues investigate the effects of CDK4/6 inhibitors on cell cycle arrest and re-entry. The authors report that long-term G1 arrest induced by CDK4/6i interferes with DNA replication during the next cell cycle, leading to DNA damage and mitotic catastrophe. Additionally, this compromised replication state sensitizes cells to chemotherapeutics that enhance replication stress. The major claims advanced in this paper are well-supported by the presented evidence. While I have several questions regarding the significance (see below), I have only a few minor points regarding the methodology. 1) Regarding the down-regulation of MCM components induced by long-term palbo treatment shown in Figure 4: MCM levels are tightly regulated by cell cycle phase. I could imagine that this gene expression change may be a consequence of, for instance, 2 days CDK4/6i treatment arresting 95% of cells in G1 while 7 days of CDK4/6i treatment causes a 99.9% G1 arrest. The data in Figure 1B seems to argue against this hypothesis, but how was that data generated? Can the authors rule out a subtle change in S-phase % over 7 days in palbo? Alternately, is the down-regulation of MCM genes a consequence of cells entering senescence?

      - We have performed extensive long-term movies with these cells, and we never see cells dividing or exiting G1 after the first day of palbociclib treatment. This is illustrated in fig.1b which demonstrates that 100% of FUCCI cells are in G1 (Red) at each of the timepoints. This will be clarified in the legend. In addition, MCM protein levels do not actually oscillate with cell cycle phase (Matson et al., 2017; Méndez and Stillman, 2000), although their mRNA levels certainly do (Leone et al., 1998; Whitfield et al., 2002). Furthermore, RPE and mammalian fibroblasts retain MCM proteins after 2 days of growth factor withdrawal despite transcriptional repression of their respective genes (Cook et al., 2002; Matson et al., 2019).

      - We see significant changes in MCM levels at a time when cells are still permissive to enter the cell cycle following drug release. Therefore, MCM reduction is not a consequence of senescence. Rather, we believe that it is one of the causes of cell cycle withdrawal following the subsequent S-phase.

      2) For the drug studies presented in figure 5, it is important that the authors perform the appropriate statistical comparisons and analyses to demonstrate true synergy. The authors show that combining palbo and certain chemotherapies causes a greater decrease in clonogenicity than palbo alone. This may or may not be surprising (see below) - but this by itself is insufficient to support the claim that palbo "sensitizes" cells to genotoxins. If you treat cells with two poisons, in 9 out of 10 cases, you'll kill more cells than if you treat cells with one poison alone. But that could be due to totally independent effects - see, for instance, Palmer and Sorger Cell 2017. There are several well-established statistical methods for investigating drug synergy - like Loewe Additivity or Bliss Independence - and one of these methods should be used to analyze the drug-combination studies presented in Figure 5.

      - This analysis will be performed at revision
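      As an illustration of the Bliss Independence criterion the reviewer refers to: if two drugs act independently, the expected surviving fraction under the combination is the product of the surviving fractions under each drug alone. The sketch below assumes clonogenic survival is expressed as a fraction of the untreated control; the function names and the example values are hypothetical and are not taken from Figure 5.

```python
def bliss_expected_survival(surv_a, surv_b):
    """Expected surviving fraction if drugs A and B act independently."""
    return surv_a * surv_b

def bliss_excess(surv_a, surv_b, surv_combo):
    """Expected minus observed survival for the combination.

    Positive: more killing than independence predicts (consistent with synergy).
    Near zero: consistent with independent action.
    Negative: less killing than predicted (antagonism).
    """
    return bliss_expected_survival(surv_a, surv_b) - surv_combo

# Hypothetical surviving fractions relative to untreated control.
palbo_alone = 0.8
genotoxin_alone = 0.5
combination = 0.25

expected = bliss_expected_survival(palbo_alone, genotoxin_alone)
excess = bliss_excess(palbo_alone, genotoxin_alone, combination)
```

      A point estimate like this does not by itself establish synergy; the excess would still need to be compared against replicate-to-replicate variation, ideally across a range of doses.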

      Reviewer #2 (Significance (Required)): While this study is a comprehensive analysis of the effects of CDK4/6i in RPE1 cells in 2d culture, I am not convinced of its broader significance. 1) So far as I can tell, the authors do not cite any studies establishing that CDK4/6i results in a significant increase in G1-arrested cells in treated patients. What evidence is there for this claim? I am aware that this has been demonstrated in xenografts and in mouse models, but I could not find evidence for this from actual clinical studies. Here, I am reminded of the very interesting work from Beth Weaver's group on paclitaxel - Zasadil STM 2014. While it had been widely assumed that paclitaxel causes a mitotic arrest, they actually show that this drug kills tumor cells by promoting mitotic catastrophe without inducing a complete mitotic arrest. Similarly, in the absence of existing clinical data, the underlying assumption regarding the effects of CDK4/6i that motivates this paper may not be accurate. For instance, if CDK4/6i acts through the immune system (as suggested by Jean Zhao and others), then this G1 arrest phenotype could be entirely secondary to the drug's actual mechanism-of-action.

      - We are very surprised by the suggestion that CDK4/6 inhibitors may not need to cause a G1 arrest in patient tumours. We appreciate that that these inhibitors effect the immune system in many different ways to combat tumourigenesis, but there is also an overwhelming amount of evidence that a G1 arrest in patient tumours is critical for the overall response. Perhaps the most striking evidence is the fact that RB loss in tumours is one of the best-characterised mechanism of resistance in breast cancer patients (Condorelli et al., 2018; Costa et al., 2020; Li et al., 2018; O'Leary et al., 2018; Wander et al., 2020). In addition, tumours types that typically achieve a poor CDK4/6i-induced G1 arrest in preclinical models, such as TNBCs, also exhibit a poor response to CDK4/6i therapy in patients. Recently a luminal androgen receptor subtype of TNBCs has been identified that responds to CDK4/6 inhibition, due to low CDK2 activity which can otherwise drive G1 progression independently of CDK4/6 in basal-like TNBCs (Asghar et al., 2017; Liu et al., 2017). This rationalises combination therapies that converge to inhibit G1 more effectively in this subtype (e.g. AR antagonist + CDK4/6 inhibition (Christenson et al., 2021)), which is akin to the oestrogen receptor and CDK4/6 combinations that have proven so successful at treating HR+ breast cancer. Many other combinations are also currently in trials based on the same premise that inhibiting upstream G1/S regulators can enhancing the response by inducing a more efficient G1 arrest (MEK, PI3K, AKT, mTOR) (Klein et al., 2018).

      - In response to the specific question about clinical G1 arrest in patients, tumour samples from breast cancer patients show a decrease in S-phase specific markers pRB and TopoIIa following abemaciclib treatment (Patnaik et al., 2016) and there is extensive evidence of a profound cell cycle arrest following CDK4/6 inhibition as judged by staining with the mitotic marker Ki67 (Hurvitz et al., 2020; Johnston et al., 2019; Ma et al., 2017; Prat et al., 2020). Whilst this does not formally prove a G1 arrest is specifically responsible for this overall cell cycle arrest, that is the implicit assumption given the known mechanism of action of CDK4/6 inhibitors in cells.

      2) How relevant are RPE1 cells? Clinically, CDK4/6 inhibitors are combined with fulvestrant (which would not have an effect in RPE1), and the activity that they exhibit in breast cancer has not been matched in any other cancer types. The underlying biology of HR+ breast cancer (particularly regarding the regulation of CCND1 expression and the G1/S transition by estrogen) may not be recapitulated by other cell types. Moreover, the artificial media used in cell culture experiments may alter the regulation of the G1/S transition. I do not believe that these experiments conducted in RPE1 cells in 2d cell culture are generalizable.

      - Fulvestrant/tamoxifen are effective because they enhance the efficiency of a CDK4/6i arrest by reducing Cyclin D expression to enhance Cyclin D-CDK4/6 inhibition. That convergence onto the G1/S transition is why ER antagonists enhance the CDK4/6 response: CDK activity is inhibited and CycD transcription is reduced, so this double hit allows breast cancer cells to arrest in G1 more efficiently than healthy tissue, which is not oestrogen-responsive (this provides yet more evidence that the G1 arrest in tumours is crucial for the clinical response). It is true that RPE1 cells do not respond to the oestrogen treatment, but that is not really relevant here in our opinion. We are not testing the efficiency of a G1 arrest beyond the initial characterisation in figure 1. We are mainly examining how cells respond to that G1 arrest afterwards. It could be that components of the cell culture media affect that downstream response in unanticipated ways, but we feel that is very unlikely.

      - Having said that, we agree that the general point on the relevance of RPE cells is a valid one, and we will repeat key experiments in breast cancer cells. We suspect that the reason replisome components become widely downregulated during a G1 arrest will not be a specific phenomenon that is characteristic of one particular cell type. Nevertheless, it is important to validate that assumption.

      3) I am confused about the effects of CDK4/6i on genotoxin sensitivity. Replogle and Amon PNAS 2020 and several citations contained therein report that CDK4/6i protects cells from DNA damage. Moreover, trilaciclib has recently received FDA approval for its ability to protect the bone marrow from cytotoxic chemotherapy! Is this a question of dose timing/intensity? The FDA approval of trilaciclib for this indication should certainly be discussed. This underscores my concern that certain findings in this paper are RPE1/tissue culture artifacts, with limited generalizability.

      - The studies the reviewer refers to demonstrate that halting cell cycle progression can protect cells from genotoxic drugs that cause DNA damage during S-phase. However, we can only think that the reviewer must have missed the critical point here: The genotoxic agents in figure 5 were added after washout from CDK4/6 inhibition (we will highlight this more clearly in the revised manuscript). After drug removal, cells enter S-phase with replication competence problems (as a result of the CDK4/6 arrest) and they then experience additional problems during S-phase (as a result of the genotoxic agents included following washout). These effects synergise to enhance replication stress, a key conclusion of figure 5.

      - This in no way supports the notion that “findings in this paper are RPE1/tissue culture artefacts with limited generalizability”. Experiments in 2D tissue culture have furnished some of the most important fundamental discoveries in cancer research. It remains to be seen whether our study will cause a paradigm shift in our thinking about how CDK4/6 inhibitors work, but we believe that it may do. We appreciate that this will not become clear until our findings are followed up and validated in preclinical models and human disease, but that does not, in our opinion, make them any less valid at this stage. As stated earlier, we will confirm this is not an RPE1 cell phenomenon, but if this holds up in breast cancer cells then we believe our data will have an important impact on future preclinical and clinical work in this area.

      **Referees cross-commenting** I think that we largely agree that RPE1 is not a great model for this study, and repeating certain key experiments in an ER+ BC line like MCF7 may be warranted.

      - We agree that it would add value to examine our findings in BC cells, therefore we will address this point at revision by repeating key experiments in BC cells.

      Additionally, I wanted to draw attention to the fact that, to my knowledge, the evidence for palbociclib inducing a G1 arrest in patients is incredibly spotty. For early-stage breast tumors where palbo is most effective, nearly all tumor cells are in G1 anyway. I think that it makes the most sense that palbo is actually working through immune modulation or through some secondary mechanism, rather than enforcing a G1 arrest. So I'm not sure about the premise of this study.

      - As discussed above, there is extensive evidence that proliferation is reduced in response to CDK4/6 inhibition in patients (Hurvitz et al., 2020; Johnston et al., 2019; Ma et al., 2017; Patnaik et al., 2016; Prat et al., 2020). We agree that proliferation in patient tumours can be slower than observed in preclinical models, and there can be many reasons for this, especially within solid tumours where hypoxia is a major factor that limits proliferation. However, we do not agree that this implies that drugs that target these tumours do not act on proliferating cells. In fact, most other broad-spectrum non-targeted chemotherapies used to treat cancer also work by targeting dividing cells, and many of these are also more effective in early stage breast cancer. In addition, and as discussed extensively above, there are many studies supporting the interpretation that a G1 arrest is critical for CDK4/6i response in breast cancer patients. Considering all of these points, we strongly believe that the premise of our study – to characterise why a G1 arrest becomes irreversible – is valid and important. This point is also made in numerous recent reviews which also highlight that this key mechanistic information is currently lacking (Goel et al., 2018; Klein et al., 2018; Knudsen and Witkiewicz, 2017; Wagner and Gil, 2020).

      - We do not disagree that the immune effects are important in patients – indeed, we cited and discussed these studies in our manuscript. However, we would argue that this works together with a G1 arrest in tumour cells. The G1 arrest most likely induces a senescent response that stimulates immune engagement and tumour clearance. These multifactorial effects of CDK4/6 inhibition, on both the tumour and the immune system, are discussed at length in these reviews: (Goel et al., 2018; Klein et al., 2018; Wagner and Gil, 2020).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): The authors clearly demonstrate, with appropriate techniques, that cells treated with clinically relevant CDK4/6 inhibitors lead to a cell cycle arrest, that is only partly reversible. The authors also demonstrate clearly that release from a cdk4/6i arrest leads to two phenomena: the inability to initiate S-phase, and a cell cycle exit in G2. The inability to initiate S-phase is partly dependent on p53, the cell cycle exit is fully dependent on p53. In the absence of p53, cells that are released from a CDK4/6i block frequently enter mitosis with unrepaired DNA lesions. The authors clearly demonstrate that cdk4/6 inhibition leads to down regulation of key replication genes. Combined treatment with genotoxic agents further exaggerates the phenotype of cell cycle exit upon cdk4/6 inhibition. **Specific comments:** Figure 1B: the loss of reversibility remains at approximately 50%. Does the phenotype of replication protein depletion not happen in the 50% of cells that do restart the cell cycle? it would be good if the authors could experimentally address the heterogeneity that is observed.

      - This is actually a result of the fixed analysis used in Fig. 1B. The irreversibility is much higher than 50% after long durations of arrest, but at the 24h timepoint used in this fixed assay many cells have exited G1 but not yet had a chance to revert back into G1 from S/G2 phase. We will reinforce this point in the legend. This highlights the value of our extensive live cell assays that can fully capture cell cycle profiles, and accurately determine when cells do/don’t enter or withdraw from different stages of the cell cycle. We believe that an overreliance on fixed endpoints in previous studies may have contributed to the genotoxic effects in S-phase being missed previously: many studies show senescence after drug washout, but the cause of that senescence only becomes apparent when you observe that cells withdraw with defects after the first S-phase.

      Figure 1C: the G1 state after S-phase. The read-out here is loss of the Fucci reporter geminin. Does this observation reflect p53-dependent activation of the APC/C-Cdh1 prematurely? This is a known effect of persistent DNA damage in G2 cells.

      - Yes, we expect that APC/C-Cdh1 activation causes geminin and cyclin degradation when cells permanently withdraw from the cell cycle from G2. This is likely caused by p53-dependent p21 activation in response to DNA replication defects, as has been shown previously in direct response to DNA damage.

      Figure 2: there seem to be two distinct phenotypes when comparing p53-wt and p53-KO: the ability to initiate S-phase after CDK4/6i removal (which is largely gone in p53 KO, only slight number after 7d treatment). And cell cycle-drop-out after S-phase (this seems to be fully p53 dependent). I am not sure if a single mechanism explains both.

      - We agree that there are p53-dependent effects on speed/extent of S-phase entry and on the resulting withdrawal from G2. It may not be a single mechanism that connects these effects, although they may be related. Our manuscript mainly focusses on the DNA replication defects and cell cycle withdrawal, but in the future, it will be important to also characterise what causes the delay in cell cycle re-entry following CDK4/6 inhibition. We suspect that this could reflect differing depths of quiescence, potentially caused by p21, which would explain the p53-dependence.

      Figure 3a: related to the previous point. It is unclear if the p21 up-regulation happens in G1 or G2 cells, and whether it is related to the inability of cells to initiate S-phase, or the cell cycle exit in G2.

      - This is a good point, and as discussed above, we suspect both may be related to p21. We will examine p21 levels during a G1 arrest to compare to the levels seen following release, and we will include this data after revision.

      It is stated that a combined action of the p53 pathways and ATR signaling prevents mitotic entry in RPE-wt cells. However, ATR should also be able to do this in p53-KO cells. Does CDK4/6 inhibition also down-regulate ATR pathway components?

      - We do not detect downregulation of any ATR pathway components in the mass spec data comparing 2 and 7 day palbociclib arrest.

      Following the observation that CDK4/6i leads to replication stress, I would hypothesise that these cells would be very sensitive to agents that inhibit the response to replication stress (inhibitors of Wee1, ATR or Chk1). Yet, these agents work preferentially in p53-deficient cells, and require cell cycle progression. Sequential treatment with CDK4/6 inhibition followed by cell cycle checkpoint inhibition may help in uncovering the phenotype.

      - This is a good point and we will perform experiments with ATR inhibitors after release from CDK4/6 inhibition to examine if this enhances the phenotype.

      The authors increase the amount of replication stress using chemotherapeutic approaches or MPS1 inhibitors. The chemotherapeutic approaches are relevant clinically, but mechanistically I don't understand this beyond adding up treatments that lead to replication defects.

      - We agree that the main value of these experiments is not to provide mechanistic insight, but rather to demonstrate that CDK4/6 inhibition can enhance the effect of current genotoxic drugs. Considering CDK4/6 inhibitors are well-tolerated, this could represent an effective way to enhance the tumour-selectivity of current genotoxic therapeutics. This has been suggested previously in a pancreatic cancer study (Salvador-Barbero et al., 2020), but the reasons given for synergy were different (DNA damage repair) and the order of drug exposure was reversed (genotoxic before CDK4/6i). This underscores the potential importance of our new data.

      - From a mechanistic point of view, these data do still suggest that CDK4/6i and genotoxic drugs converge onto the same replication stress phenotype, thereby supporting our overall conclusions. One interpretation is that a reduction in replisome levels and licenced replication origins impairs the ability of cells to overcome replication problems induced by chemotherapy drugs. Conceptualising how these drugs may synergize in this way will be important in designing new studies and trials to address this synergy more broadly.

      The aneuploidy treatment is a bit weird, because it may trigger a p53 response before the cells are released from a CDK4/6i arrest. Besides, MPS1 inhibition does more than just cause replication stress and is not very clinically relevant in this context.

      - We agree that the aneuploidy experiment could have many different interpretations, and only one of these relates specifically to replication stress. This was also commented on by reviewer 1, so we feel it is best to remove this data and just keep the data on drugs that affect replication stress or DNA damage directly. We will address the effects of aneuploidy more extensively in a separate study.

      Reviewer #3 (Significance (Required)): In their manuscript entitled: Crozier and co-workers studied the effects of CDK4/6 inhibition on cell growth. CDK4/6 inhibitors are currently used in the treatment for hormone-positive breast cancers, but their cell biological effects on tumor cells remain incompletely clear, which may hamper the further clinical development of these drugs for breast cancer or other cancers. Inhibition of CDK4/6 is known to trigger a cell cycle arrest, and it is currently unclear how this could lead to long-term tumor control. This manuscript addresses the question why cdk4/6 inhibitors cause long-term cell cycle exit.

      - We thank the reviewer for this simple description of our work, which we think pitches the significance very clearly. There are currently 15 different CDK4/6 inhibitors in clinical trials, and more than 100 further trials using the 3 currently licenced inhibitors in a wide variety of tumour types and drug combinations. Although the clinical work on these drugs is huge, it is unclear how they cause long-term cell cycle arrest and we now link this to genotoxic stress for the first time. This explains clearly why this work is potentially very significant. We agree, however, that the main caveat is the need to demonstrate our findings are also applicable to breast cancer cells. But, if this is the case, we believe this would represent a paradigm shift in our understanding of how these drugs work, especially considering that genomic damage is a universal route to prolonged cell cycle exit in response to almost all other broad-spectrum anti-cancer drugs.

      There are two issues that affect the significance of the findings: the authors start their manuscript with a strong translational/clinical issue, but solely use RPE1 cell lines to address this issue. It remains unclear if their observations hold true in breast cancer models. It would be advised to repeat key findings in a hormone receptor-positive breast cancer model.

      - We will examine the applicability of our findings in breast cancer cells and include this work at revision.

      The effects of CDK4/6 inhibitors are observed at clinically relevant doses. However, the effects are observed upon switch-like washout of the CDK4/6 inhibitors. This does not per se reflect the pharmacodynamics of the more gradual increase and decrease of drug concentrations in tumor cells. The significance of this work would be greater if cell cycle exit with replication stress were observed either in clinical samples or in in vivo treated cancer cells.

      - We agree that the significance of this work will ultimately only become fully apparent if replication stress is confirmed in clinical samples or in vivo. We envisage that our study will stimulate exactly this type of analysis in future. However, we would also add that the gradual increase/decrease in drug concentrations seen in patients is still likely to lead to switch-like cell cycle re-entry given the switch-like nature of cell cycle controls at the G1/S transition. So, the timing may be different, but we would not predict that the downstream response in S-phase would be. However, whether replication stress is seen during drug-free washout periods in patients is clearly a critical future question, as we highlight in the discussion.

      References

      Asghar, U.S., Barr, A.R., Cutts, R., Beaney, M., Babina, I., Sampath, D., Giltnane, J., Lacap, J.A., Crocker, L., Young, A., et al. (2017). Single-Cell Dynamics Determines Response to CDK4/6 Inhibition in Triple-Negative Breast Cancer. Clin Cancer Res 23, 5561-5572.

      Christenson, J.L., O'Neill, K.I., Williams, M.M., Spoelstra, N.S., Jones, K.L., Trahan, G.D., Reese, J., Van Patten, E.T., Elias, A., Eisner, J.R., et al. (2021). Activity of combined androgen receptor antagonism and cell cycle inhibition in androgen receptor-positive triple-negative breast cancer. Mol Cancer Ther.

      Condorelli, R., Spring, L., O'Shaughnessy, J., Lacroix, L., Bailleux, C., Scott, V., Dubois, J., Nagy, R.J., Lanman, R.B., Iafrate, A.J., et al. (2018). Polyclonal RB1 mutations and acquired resistance to CDK 4/6 inhibitors in patients with metastatic breast cancer. Annals of oncology : official journal of the European Society for Medical Oncology 29, 640-645.

      Cook, J.G., Park, C.H., Burke, T.W., Leone, G., DeGregori, J., Engel, A., and Nevins, J.R. (2002). Analysis of Cdc6 function in the assembly of mammalian prereplication complexes. Proceedings of the National Academy of Sciences of the United States of America 99, 1347-1352.

      Costa, C., Wang, Y., Ly, A., Hosono, Y., Murchie, E., Walmsley, C.S., Huynh, T., Healy, C., Peterson, R., Yanase, S., et al. (2020). PTEN Loss Mediates Clinical Cross-Resistance to CDK4/6 and PI3Kα Inhibitors in Breast Cancer. Cancer Discov 10, 72-85.

      Ge, X.Q., Jackson, D.A., and Blow, J.J. (2007). Dormant origins licensed by excess Mcm2-7 are required for human cells to survive replicative stress. Genes Dev 21, 3331-3341.

      Goel, S., DeCristo, M.J., McAllister, S.S., and Zhao, J.J. (2018). CDK4/6 Inhibition in Cancer: Beyond Cell Cycle Arrest. Trends Cell Biol 28, 911-925.

      Hurvitz, S.A., Martin, M., Press, M.F., Chan, D., Fernandez-Abad, M., Petru, E., Rostorfer, R., Guarneri, V., Huang, C.S., Barriga, S., et al. (2020). Potent Cell-Cycle Inhibition and Upregulation of Immune Response with Abemaciclib and Anastrozole in neoMONARCH, Phase II Neoadjuvant Study in HR(+)/HER2(-) Breast Cancer. Clin Cancer Res 26, 566-580.

      Ibarra, A., Schwob, E., and Méndez, J. (2008). Excess MCM proteins protect human cells from replicative stress by licensing backup origins of replication. Proceedings of the National Academy of Sciences of the United States of America 105, 8956-8961.

      Johnston, S., Puhalla, S., Wheatley, D., Ring, A., Barry, P., Holcombe, C., Boileau, J.F., Provencher, L., Robidoux, A., Rimawi, M., et al. (2019). Randomized Phase II Study Evaluating Palbociclib in Addition to Letrozole as Neoadjuvant Therapy in Estrogen Receptor-Positive Early Breast Cancer: PALLET Trial. J Clin Oncol 37, 178-189.

      Klein, M.E., Kovatcheva, M., Davis, L.E., Tap, W.D., and Koff, A. (2018). CDK4/6 Inhibitors: The Mechanism of Action May Not Be as Simple as Once Thought. Cancer Cell 34, 9-20.

      Knudsen, E.S., and Witkiewicz, A.K. (2017). The Strange Case of CDK4/6 Inhibitors: Mechanisms, Resistance, and Combination Strategies. Trends in cancer 3, 39-55.

      Leone, G., DeGregori, J., Yan, Z., Jakoi, L., Ishida, S., Williams, R.S., and Nevins, J.R. (1998). E2F3 activity is regulated during the cell cycle and is required for the induction of S phase. Genes Dev 12, 2120-2130.

      Li, Z., Razavi, P., Li, Q., Toy, W., Liu, B., Ping, C., Hsieh, W., Sanchez-Vega, F., Brown, D.N., Da Cruz Paula, A.F., et al. (2018). Loss of the FAT1 Tumor Suppressor Promotes Resistance to CDK4/6 Inhibitors via the Hippo Pathway. Cancer Cell 34, 893-905.e898.

      Liu, C.Y., Lau, K.Y., Hsu, C.C., Chen, J.L., Lee, C.H., Huang, T.T., Chen, Y.T., Huang, C.T., Lin, P.H., and Tseng, L.M. (2017). Combination of palbociclib with enzalutamide shows in vitro activity in RB proficient and androgen receptor positive triple negative breast cancer cells. PloS one 12, e0189007.

      Ma, C.X., Gao, F., Luo, J., Northfelt, D.W., Goetz, M., Forero, A., Hoog, J., Naughton, M., Ademuyiwa, F., Suresh, R., et al. (2017). NeoPalAna: Neoadjuvant Palbociclib, a Cyclin-Dependent Kinase 4/6 Inhibitor, and Anastrozole for Clinical Stage 2 or 3 Estrogen Receptor-Positive Breast Cancer. Clin Cancer Res 23, 4055-4065.

      Matson, J.P., Dumitru, R., Coryell, P., Baxley, R.M., Chen, W., Twaroski, K., Webber, B.R., Tolar, J., Bielinsky, A.K., Purvis, J.E., et al. (2017). Rapid DNA replication origin licensing protects stem cell pluripotency. eLife 6.

      Matson, J.P., House, A.M., Grant, G.D., Wu, H., Perez, J., and Cook, J.G. (2019). Intrinsic checkpoint deficiency during cell cycle re-entry from quiescence. J Cell Biol 218, 2169-2184.

      Méndez, J., and Stillman, B. (2000). Chromatin association of human origin recognition complex, cdc6, and minichromosome maintenance proteins during the cell cycle: assembly of prereplication complexes in late mitosis. Mol Cell Biol 20, 8602-8612.

      O'Leary, B., Cutts, R.J., Liu, Y., Hrebien, S., Huang, X., Fenwick, K., André, F., Loibl, S., Loi, S., Garcia-Murillas, I., et al. (2018). The Genetic Landscape and Clonal Evolution of Breast Cancer Resistance to Palbociclib plus Fulvestrant in the PALOMA-3 Trial. Cancer Discov 8, 1390-1403.

      Patnaik, A., Rosen, L.S., Tolaney, S.M., Tolcher, A.W., Goldman, J.W., Gandhi, L., Papadopoulos, K.P., Beeram, M., Rasco, D.W., Hilton, J.F., et al. (2016). Efficacy and Safety of Abemaciclib, an Inhibitor of CDK4 and CDK6, for Patients with Breast Cancer, Non-Small Cell Lung Cancer, and Other Solid Tumors. Cancer Discov 6, 740-753.

      Prat, A., Saura, C., Pascual, T., Hernando, C., Muñoz, M., Paré, L., González Farré, B., Fernández, P.L., Galván, P., Chic, N., et al. (2020). Ribociclib plus letrozole versus chemotherapy for postmenopausal women with hormone receptor-positive, HER2-negative, luminal B breast cancer (CORALLEEN): an open-label, multicentre, randomised, phase 2 trial. Lancet Oncol 21, 33-43.

      Salvador-Barbero, B., Álvarez-Fernández, M., Zapatero-Solana, E., El Bakkali, A., Menéndez, M.D.C., López-Casas, P.P., Di Domenico, T., Xie, T., VanArsdale, T., Shields, D.J., et al. (2020). CDK4/6 Inhibitors Impair Recovery from Cytotoxic Chemotherapy in Pancreatic Adenocarcinoma. Cancer Cell 37, 340-353.e346.

      Wagner, V., and Gil, J. (2020). Senescence as a therapeutically relevant response to CDK4/6 inhibitors. Oncogene.

      Wander, S.A., Cohen, O., Gong, X., Johnson, G.N., Buendia-Buendia, J.E., Lloyd, M.R., Kim, D., Luo, F., Mao, P., Helvie, K., et al. (2020). The genomic landscape of intrinsic and acquired resistance to cyclin-dependent kinase 4/6 inhibitors in patients with hormone receptor positive metastatic breast cancer. Cancer Discov.

      Whitfield, M.L., Sherlock, G., Saldanha, A.J., Murray, J.I., Ball, C.A., Alexander, K.E., Matese, J.C., Perou, C.M., Hurt, M.M., Brown, P.O., et al. (2002). Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell 13, 1977-2000.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors clearly demonstrate, with appropriate techniques, that cells treated with clinically relevant CDK4/6 inhibitors lead to a cell cycle arrest, that is only partly reversible.

      The authors also demonstrate clearly that release from a cdk4/6i arrest leads to two phenomena: the inability to initiate S-phase, and a cell cycle exit in G2.

      The inability to initiate S-phase is partly dependent on p53, the cell cycle exit is fully dependent on p53.

      In the absence of p53, cells that are released from a CDK4/6i block frequently enter mitosis with unrepaired DNA lesions.

      The authors clearly demonstrate that cdk4/6 inhibition leads to down regulation of key replication genes.

      Combined treatment with genotoxic agents further exaggerates the phenotype of cell cycle exit upon cdk4/6 inhibition.

      Specific comments:

      Figure 1B: the loss of reversibility remains at approximately 50%. Does the phenotype of replication protein depletion not happen in the 50% of cells that do restart the cell cycle? it would be good if the authors could experimentally address the heterogeneity that is observed.

      Figure 1C: the G1 state after S-phase. The read-out here is loss of the Fucci reporter geminin. Does this observation reflect p53-dependent activation of the APC/C-Cdh1 prematurely? This is a known effect of persistent DNA damage in G2 cells.

      Figure 2: there seem to be two distinct phenotypes when comparing p53-wt and p53-KO: the ability to initiate S-phase after CDK4/6i removal (which is largely gone in p53 KO, only slight number after 7d treatment). And cell cycle-drop-out after S-phase (this seems to be fully p53 dependent). I am not sure if a single mechanism explains both.

      Figure 3a: related to the previous point. It is unclear if the p21 up-regulation happens in G1 or G2 cells, and whether it is related to the inability of cells to initiate S-phase, or the cell cycle exit in G2.

      It is stated that a combined action of the p53 pathways and ATR signaling prevents mitotic entry in RPE-wt cells. However, ATR should also be able to do this in p53-KO cells. Does CDK4/6 inhibition also down-regulate ATR pathway components?

      Following the observation that CDK4/6i leads to replication stress, I would hypothesise that these cells would be very sensitive to agents that inhibit the response to replication stress (inhibitors of Wee1, ATR or Chk1). Yet, these agents work preferentially in p53-deficient cells, and require cell cycle progression. Sequential treatment with CDK4/6 inhibition followed by cell cycle checkpoint inhibition may help in uncovering the phenotype.

      The authors increase the amount of replication stress using chemotherapeutic approaches or MPS1 inhibitors. The chemotherapeutic approaches are relevant clinically, but mechanistically I don't understand this beyond adding up treatments that lead to replication defects.

      The aneuploidy treatment is a bit weird, because it may trigger a p53 response before the cells are released from a CDK4/6i arrest. Besides, MPS1 inhibition does more than just cause replication stress and is not very clinically relevant in this context.

      Significance

      In their manuscript entitled: Crozier and co-workers studied the effects of CDK4/6 inhibition on cell growth. CDK4/6 inhibitors are currently used in the treatment for hormone-positive breast cancers, but their cell biological effects on tumor cells remain incompletely clear, which may hamper the further clinical development of these drugs for breast cancer or other cancers.

      Inhibition of CDK4/6 is known to trigger a cell cycle arrest, and it is currently unclear how this could lead to long-term tumor control. This manuscript addresses the question why cdk4/6 inhibitors cause long-term cell cycle exit.

      There are two issues that affect the significance of the findings:

      -The authors start their manuscript with a strong translational/clinical issue, but solely use RPE1 cell lines to address this issue. It remains unclear if their observations hold true in breast cancer models. It would be advised to repeat key findings in a hormone receptor-positive breast cancer model.

      -The effects of CDK4/6 inhibitors are observed at clinically relevant doses. However, the effects are observed upon switch-like washout of the CDK4/6 inhibitors. This does not per se reflect the pharmacodynamics of the more gradual increase and decrease of drug concentrations in tumor cells. The significance of this work would be greater if cell cycle exit with replication stress were observed either in clinical samples or in in vivo treated cancer cells.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this paper, Saurin and colleagues investigate the effects of CDK4/6 inhibitors on cell cycle arrest and re-entry. The authors report that long-term G1 arrest induced by CDK4/6i interferes with DNA replication during the next cell cycle, leading to DNA damage and mitotic catastrophe. Additionally, this compromised replication state sensitizes cells to chemotherapeutics that enhance replication stress.

      The major claims advanced in this paper are well-supported by the presented evidence. While I have several questions regarding the significance (see below), I have only a few minor points regarding the methodology.

      1) Regarding the down-regulation of MCM components induced by long-term palbo treatment shown in Figure 4: MCM levels are tightly regulated by cell cycle phase. I could imagine that this gene expression change may be a consequence of, for instance, 2 days CDK4/6i treatment arresting 95% of cells in G1 while 7 days of CDK4/6i treatment causes a 99.9% G1 arrest. The data in Figure 1B seems to argue against this hypothesis, but how was that data generated? Can the authors rule out a subtle change in S-phase % over 7 days in palbo?

      Alternately, is the down-regulation of MCM genes a consequence of cells entering senescence?

      2) For the drug studies presented in figure 5, it is important that the authors perform the appropriate statistical comparisons and analyses to demonstrate true synergy. The authors show that combining palbo and certain chemotherapies causes a greater decrease in clonogenicity than palbo alone. This may or may not be surprising (see below) - but this by itself is insufficient to support the claim that palbo "sensitizes" cells to genotoxins. If you treat cells with two poisons, in 9 out of 10 cases, you'll kill more cells than if you treat cells with one poison alone. But that could be due to totally independent effects - see, for instance, Palmer and Sorger Cell 2017. There are several well-established statistical methods for investigating drug synergy - like Loewe Additivity or Bliss Independence - and one of these methods should be used to analyze the drug-combination studies presented in Figure 5.
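
      As a rough illustration of the kind of analysis requested in this point, the minimal sketch below computes the Bliss Independence expectation for a two-drug combination and the excess of the observed effect over it. The function names and all numbers are hypothetical and are not taken from the manuscript; effects are assumed to be expressed as fractional reductions in clonogenicity between 0 and 1.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected fractional effect (0..1) if the two drugs act independently."""
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractions between 0 and 1")
    # Probability that a cell is affected by at least one of two independent agents.
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(effect_a: float, effect_b: float, effect_combo: float) -> float:
    """Observed minus expected effect; > 0 hints at synergy, < 0 at antagonism."""
    return effect_combo - bliss_expected(effect_a, effect_b)


# Hypothetical fractional reductions in clonogenicity (illustrative values only).
palbo_alone = 0.40
genotoxin_alone = 0.30
combination = 0.70

expected = bliss_expected(palbo_alone, genotoxin_alone)
excess = bliss_excess(palbo_alone, genotoxin_alone, combination)
print(f"expected={expected:.2f} excess={excess:.2f}")
```

      A positive excess alone is not proof of synergy; replicate variability would still need to be accounted for, for example with confidence intervals on the excess across independent experiments.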

      Significance

      While this study is a comprehensive analysis of the effects of CDK4/6i in RPE1 cells in 2D culture, I am not convinced of its broader significance.

      1) So far as I can tell, the authors do not cite any studies establishing that CDK4/6i results in a significant increase in G1-arrested cells in treated patients. What evidence is there for this claim? I am aware that this has been demonstrated in xenografts and in mouse models, but I could not find evidence for this from actual clinical studies. Here, I am reminded of the very interesting work from Beth Weaver's group on paclitaxel - Zasadil STM 2014. While it had been widely assumed that paclitaxel causes a mitotic arrest, they actually show that this drug kills tumor cells by promoting mitotic catastrophe without inducing a complete mitotic arrest. Similarly, in the absence of existing clinical data, the underlying assumption regarding the effects of CDK4/6i that motivates this paper may not be accurate. For instance, if CDK4/6i acts through the immune system (as suggested by Jean Zhao and others), then this G1 arrest phenotype could be entirely secondary to the drug's actual mechanism-of-action.

      2) How relevant are RPE1 cells? Clinically, CDK4/6 inhibitors are combined with fulvestrant (which would not have an effect in RPE1), and the activity that they exhibit in breast cancer has not been matched in any other cancer types. The underlying biology of HR+ breast cancer (particularly regarding the regulation of CCND1 expression and the G1/S transition by estrogen) may not be recapitulated by other cell types. Moreover, the artificial media used in cell culture experiments may alter the regulation of the G1/S transition. I do not believe that these experiments conducted in RPE1 cells in 2D cell culture are generalizable.

      3) I am confused about the effects of CDK4/6i on genotoxin sensitivity. Replogle and Amon PNAS 2020 and several citations contained therein report that CDK4/6i protects cells from DNA damage. Moreover, trilaciclib has recently received FDA approval for its ability to protect the bone marrow from cytotoxic chemotherapy! Is this a question of dose timing/intensity? The FDA approval of trilaciclib for this indication should certainly be discussed. This underscores my concern that certain findings in this paper are RPE1/tissue culture artifacts, with limited generalizability.

      Referees cross-commenting

      I think that we largely agree that RPE1 is not a great model for this study, and repeating certain key experiments in an ER+ BC line like MCF7 may be warranted.

      Additionally, I wanted to draw attention to the fact that, to my knowledge, the evidence for palbociclib inducing a G1 arrest in patients is incredibly spotty. For early-stage breast tumors where palbo is most effective, nearly all tumor cells are in G1 anyway. I think that it makes the most sense that palbo is actually working through immune modulation or through some secondary mechanism, rather than enforcing a G1 arrest. So I'm not sure about the premise of this study.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Comments on 'CDK4/6 inhibitors induce replication stress to cause long-term cell cycle withdrawal'

      The rationale for this work is to understand the mechanism by which Cdk4/6 inhibitors inhibit tumour cell growth, specifically via senescence, which seems to be a frequent outcome of Cdk4/6 inhibition. Although several mechanisms by which Cdk4/6 inhibition induces senescence have been proposed, these have varied with the cancer cell model studied. To examine the mechanism for the cytostatic effect of cdk4/6i in therapy without potential confounding effects of different cancer cell line backgrounds, Crozier et al tackle this question in the non-transformed, immortalised diploid human cell line, RPE1. They use live cell imaging and colony formation to track the impact of G1 arrests of different lengths induced by a range of clinically relevant cdk4/6 inhibitors. They also use CRISPR-mediated removal of p53 to examine the role of p53 in the observed cell cycle responses. After noting that G1 arrest of over 2 days leads to a pronounced failure in continued cell cycle and proliferation that is associated with features of replication stress, they perform a proteomics analysis to determine the factors responsible for this. They discover that MCM complex components and some other replicative proteins are downregulated and overall suggest a mechanism whereby downregulation of these essential replication components during a prolonged G1 induces replication stress and ultimate failure of proliferation. They show the impact of cdk4/6 inhibition can be increased by combining with either aneuploidy induction (to indirectly elevate replication stress), aphidicolin (to directly elevate replication stress) or chemotherapy agents that damage DNA.

      Overall this is a well written and presented manuscript. Data are extremely clearly presented and described clearly within the text. Most appropriate controls were included and the work is performed to a high standard. I have a few comments about the proteomic analysis, and the link between MCM component deregulation and the induction of replication stress:

      Major points:

      1. Relevance to cancer. I appreciate that examining the mechanism in a diploid line is a sensible place to start. However it remains a bit unclear precisely which aspects of this mechanism might be conserved in cancer. It could be helpful to provide evidence (if it exists) of the impact of cdk4/6 inhibition in tumour cells. For example, are catastrophic mitosis, senescence, etc observed? And is there anything further known about the relationship between tumour mutations such as p53 and clinical response to Cdk4/6i? Also - many of the phenotypes followed in this manuscript vary considerably with the length of G1 and the length of release. Which of these scenarios might mimic in vivo conditions? Relating to the downregulation of MCM complex members, and the potential impact on origin licensing, how would this mechanism be manifest in cancer cells that have already deregulated gene transcription programs, and are already experiencing replication stress?
      2. MCM protein levels and proposed impact on chromatin loading and origin licensing. Several MCM components are clearly reduced at the protein level. A chromatin assay (assaying fluorescence of signal remaining after pre-extraction of cytosolic proteins) suggests that MCM loading on chromatin is reduced, and this is taken to suggest a reduction in origin licensing. This is quite an indirect method, and it is difficult to conclude that the reduced chromatin-bound fraction really represents a meaningful reduction in origin licensing. It would be more convincing if both positive and negative controls for this assay were included. Moreover, it is not clear whether this MCM reduction and proposed reduction in licensed origins would actually impact replication in an otherwise unperturbed state. Many more origins are licensed than actually fire during a normal S-phase, so it is not entirely clear that MCM levels could lead directly to replication stress here.
      3. Loss of MCM protein levels and chromatin loading occurs after 1 day, not 4 days, of Cdk4/6 inhibition. The current proposal (based on evidence from the live cell imaging, and the induction of hallmarks of replication stress in figures 1-3) seems to be that something occurs between 2 and 7 days of cdk4/6i to prevent cells from resuming a normal cell cycle. Thus the proteomics was performed between 2 and 7 days, and MCM proteins were identified as major changed proteins between those times. However, according to Western blots and FACS profiles in Figure 4, the major reduction in MCM protein levels and chromatin loading occurs already at 1 day of cdk4/6i (Figure 4d,e,f). However, replication stress is not observed after this timepoint (Figure 3), so this seems to decouple the timings of MCM reduction from induction of replication stress. How can this be reconciled?

      Minor points:

      1. All the live cell tracking figures would be even more informative if a quantification of key features (such as a cumulative frequency of S-phase entry, or a mean+SD of time in G1, S and G2) were also presented.
      2. In Figure 2D the cells released from palbociclib seem to delay longer in G1 until they start to enter S phase, compared to cells co-treated with STLC (Figure 2B). Why would this be? It is difficult to tell if other subtle effects might be present between the +STLC and -STLC conditions, so additional graphs such as those suggested above might be informative here in particular.
      3. Figure 4f: It would be helpful to see the FACS plot for at least one of the conditions quantified in the graph as a comparison.
      4. MCM2 protein is not down in p53 wt, but is reduced in p53 KO cells - why is this? And why is MCM2 not impacted when the other MCM complex members are?
      5. Inducing aneuploidy with reversine to elevate replication stress may result in additional aneuploidy-related stresses that confound this interpretation. For example, aneuploidy per se is known to elevate p21 and p53 levels, and chromosome mis-segregation could elevate DNA damage. For these reasons these experiments are not as compelling as the direct elevation of replication stress using aphidicolin.
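
      The quantification suggested in minor point 1 could be sketched roughly as follows. This is only an illustrative outline: the function names, the input layout (one S-phase entry time per cell that entered, plus the total number of tracked cells), and all values are hypothetical and do not come from the manuscript's tracking data.

```python
from statistics import mean, stdev


def cumulative_entry_fraction(entry_times_h, n_cells, time_points_h):
    """Fraction of tracked cells that have entered S phase by each time point.

    entry_times_h holds one entry time (hours after release) per cell that
    entered S phase; cells that never entered only contribute to n_cells.
    """
    if n_cells < len(entry_times_h) or n_cells <= 0:
        raise ValueError("n_cells must be positive and cover all entering cells")
    return [sum(t <= tp for t in entry_times_h) / n_cells for tp in time_points_h]


def mean_and_sd(durations_h):
    """Mean and sample standard deviation of phase durations (hours)."""
    return mean(durations_h), stdev(durations_h)


# Hypothetical tracking data for one condition (illustrative values only).
entry_times = [6.0, 8.5, 9.0, 12.0, 20.0]  # 5 of 8 tracked cells entered S phase
curve = cumulative_entry_fraction(entry_times, 8, [0, 6, 12, 24])
g1_mean, g1_sd = mean_and_sd([10.0, 12.0, 14.0])
print(curve, g1_mean, g1_sd)
```

      Keeping non-entering cells in the denominator means the curve plateaus below 1 when some cells remain arrested, which is the feature of interest when comparing conditions.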

      Interesting points to follow up/add more mechanism

      1. What is the mechanism of protein downregulation of MCM etc? Was gene transcription impacted, or is this a question of protein stability? Depletion of one subunit can destabilise the complex, leading to protein loss of the other MCM subunits, so perhaps this effect could be due to downregulation of a single MCM complex member.
      2. Are these findings specific to Cdk4/6 inhibitors, or would another means of arresting cells in G1 have the same impact?

      Significance

      The central question of the paper is an important one so this work would be of interest to many in the clinical and preclinical fields, and also to the cell cycle and replication stress fields.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We are grateful to the editors at Review Commons and to the reviewers for their thoughtful attention to our manuscript. Our work presents data showing that deletion of the apoptosis regulator Mcl-1 in CNS stem cells that give rise to neurons and glia resulted in specific degeneration of the white matter, beginning after postnatal day 7 (P7). Cellular analysis shows that oligodendrocytes were depleted while astrocytes persisted. Co-deletion of apoptosis effectors Bax or Bak rescued different aspects of the Mcl-1 deletion phenotype, confirming the role of apoptosis. Based on these observations, we conclude that oligodendrocytes require MCL-1 to prevent spontaneous apoptosis, and that MCL-1 depletion results in leukodystrophy, which resembles severe cases of the human disorder Vanishing White Matter Disease (VWMD). We further suggest that MCL-1 deficiency, caused by the eIF2B mutations of VWMD, may play a critical role in VWMD pathogenesis.

      The reviewers questioned the similarity of the Mcl-1 deletion phenotype to VWMD and were not convinced that MCL-1 deficiency is integral to VWMD. Based on reviewer feedback, we concede that a firm link to VWMD is not supported by the available data. We consider, however, that our findings that MCL-1 is required for oligodendrocyte survival and white matter stability remain highly significant. Accordingly, we propose to revise the work as suggested by Reviewer 1 to highlight the insight our data provide as to apoptosis regulation in glia and its importance for brain development, and to revise the title, as suggested by Reviewer 3, to remove the specific reference to VWMD.

      In the revision, we will make clear that the comparison to specific leukodystrophies is hypothetical and will require extensive follow-up experiments that are suggested by the findings of this work, as described in the reviews. Revising our work by removing the assertion that our data strongly implicate MCL-1 in VWMD pathogenesis will address the main reviewer concern, strengthen the logical flow, and highlight the potential for MCL-1 to be broadly relevant to white matter pathology. The significance of our findings that oligodendrocytes depend on MCL-1 protein to prevent their spontaneous apoptosis, and that MCL-1 deficiency produces white matter degeneration, will not be altered by these changes. Our data will continue to show that MCL-1 dependence is a physiologic vulnerability of oligodendrocytes that sets them apart from astrocytes and neurons and that this vulnerability is sufficient to cause white matter-specific brain degeneration when MCL-1 expression is blocked.

      The other issues raised by the reviewers are all tractable and can be addressed with new experiments that we can complete in a short time-frame, such as studies of retinal pathology and additional immunohistochemistry studies, or with changes to the text. We consider that with these revisions, the manuscript will be an important contribution to understanding glial biology and the pathogenesis of white matter-specific disorders. We describe in detail below our responses to reviewer feedback and planned changes to the manuscript.

      Reviewer comments are in italics and our responses are in plain text.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Comments

      While we acknowledge many important points in this review, this first point is based on a premise that is inaccurate. Based on published data, we respectfully disagree with the statement that “Depletion of MCL-1 in any tissue would promote apoptosis in cells of this tissue”. Most cells do not require an anti-apoptotic protein to prevent spontaneous apoptosis; cells that depend on anti-apoptotic proteins are specifically referred to as “primed for apoptosis” (1-5). Our conditional deletion genotypes ablated Mcl-1 in neurons of the forebrain and cerebellum and in all subtypes of glial cells. The loss of oligodendrocytes in our Mcl-1-deleted mice shows that a specific subset of white matter cells in the postnatal brain requires MCL-1. Together with the increase in apoptosis and the rescues by co-deletion of Bax or Bak, these data demonstrate that cells within the oligodendrocyte lineage are primed for apoptosis in a manner that is restricted by MCL-1. In contrast, we have shown in published data that we cite in this manuscript that conditional deletion of Mcl-1 in cerebellar granule neurons, the largest neuronal population in the brain, does not cause apoptosis (6); these data provide direct evidence that large populations of cells in the brain do not depend on MCL-1. We therefore disagree with the characterization of the brain-specific Mcl-1 deletion phenotype as “non-specific”.

      • The white matter disease is interpreted as similar to VWM; VWM is specifically investigated and MCL-1 is found to be decreased in VWM brain tissue. The decrease is most likely nonspecific. Decrease in MCL-1 is most likely part of a general mechanism of degeneration of brain tissue or white matter. That is a different but also important conclusion. It is essential that other progressive leukodystrophies and acquired brain diseases with tissue degeneration, such as encephalitis, are investigated as well to see whether MCL-1 is also decreased in these disorders. If so, the MCL-1 decrease in white matter disease and other brain degenerative disease should be described as a final common pathway rather than specifically applicable to VWM.*

      We agree that MCL-1 is likely to be a final common point in multiple disease processes that affect white matter. As described in our response to point 3 below, we are persuaded by the reviewers that the proposed similarity of the Mcl-1 deletion phenotype and VWMD is not sufficiently supported by the available evidence. We will revise the text to make clear that we consider that impaired MCL-1 is “likely part of a general mechanism of degeneration of… white matter”.

      • Adding to point 2 is the fact that the pathology of the brain-specific MCL-1 knock-out mouse does not resemble the pathology of VWM at all. The central features of VWM are abnormal astrocyte morphology with astrocytes having a few stunted processes, lack of reactive astrogliosis, lack of microgliosis, increase in number of oligodendrocytes and presence of foamy oligodendrocytes. The increase in oligodendrocytes in VWM may be such that the high cellularity leads to diffusion restriction on MRI. Bergmann glia are typically ectopic, but not reduced in number. By contrast, the brain-specific MCL-1 knock-out mouse is characterized by decreased numbers of oligodendrocytes, increased numbers of microglia, reactive astrogliosis, decreased numbers of Bergmann glia and ectopic granule cells. No morphological abnormalities of oligodendrocytes and astrocytes are observed. So, histopathologically the only shared feature is preferential involvement of the brain white matter.*

      We are persuaded by the reviewers that our assertion of a high degree of similarity between the Mcl-1 deletion phenotype and VWMD was not adequately supported by our available data. In the revision, we will state that a role for MCL-1 deficiency in VWMD pathogenesis is hypothetical, and that additional studies beyond the scope of this project will be needed to test this hypothesis. However, we reassert that the white matter specificity of the Mcl-1-deletion phenotype is important.

      The reviewer accurately characterizes the pathology of the Mcl-1 deletion phenotype and notes “the preferential involvement of the white matter”. We consider that the preferential involvement of white matter, and of oligodendrocytes within the white matter, is highly significant. We will revise the work to focus on the Mcl-1 deletion phenotype, the white matter specificity, and the potential relevance to diverse white matter-specific diseases.

      While we concede that more data would be needed to firmly connect MCL-1 to VWMD, we do not agree that the Mcl-1 phenotype “does not resemble the pathology of VWM at all”. There is a diversity of published observations of pathology in VWMD and not all published reports support the descriptions in the reviewer comment. This diversity of findings is highly relevant to our work. For example, while autopsy studies of humans with end stage VWMD show lack of microgliosis (7), studies of mice with a mutation known to cause VWMD in humans, that clearly recapitulate VWMD, show robust microgliosis earlier in the disease process (8). These different observations raise the possibility that microgliosis occurs during the period of active neurodegeneration or at least that in murine brain, the VWMD process activates a microglial reaction. Either interpretation would support a likeness between Mcl-1-deleted mice and VWMD mouse models. Another study of cerebellar pathology in twin human fetuses with characteristic VWMD mutations showed complete absence of Bergmann glia (9). We propose in the revision to address the reviewer’s concerns by presenting the diversity of perspectives on microglial reaction and Bergmann glial changes in VWMD, including all of the citations above.

      • The clarity of the work would benefit from a different approach to introduce the study. It would help the reader to know that (1) gray matter cell specific Mcl-1 deletion in mice did not cause apoptosis and (2) apoptosis may have different effector proteins. This important information is now in the discussion. The switch to another cell type in the brain (hGFAP+ cells) would be logical and the significance of the work may improve. When approaching the topic from the field of leukodystrophies one would not necessarily think of deleting the Mcl-1 gene, especially as this gene is not associated with any known leukodystrophy and tends to associate with preneoplastic and neoplastic disease.*

      We appreciate these suggestions, which we agree will enhance the logical flow and the significance, in line with our response to point 3. We will revise the Introduction as suggested.

      • The authors claim that the ISR is activated in VWM, which means that eIF2α phosphorylation levels are increased, general protein synthesis is decreased and a transcription pathway is regulated by ATF4 and other factors. However, this is not what is seen in VWM. Increased eIF2α phosphorylation and reduced general protein synthesis are not observed in VWM; strikingly, the level of eIF2α phosphorylation is reduced, general protein synthesis appears at a normal rate, and only the ATF4-regulated transcriptome is continuously expressed in VWM astrocytes. *

      This point is not well-settled, as published studies show that the ISR is activated in VWMD despite decreased eIF2α phosphorylation (10, 11). Moreover, published scRNA-seq studies of mice with VWMD mutations show that the ISR transcriptome is activated in oligodendrocytes, as well as neurons, endothelial cells and microglia (8). We will address this concern in the revision by citing these published reports that show both decreased eIF2α phosphorylation and lines of evidence that support ISR activation.

      Fritsh et al. show that MCL-1 protein synthesis is reduced by increased eIF2α phosphorylation due to reduced translation rates at the Mcl-1 mRNA and not due to differences in Mcl-1 mRNA levels.

      We agree with this interpretation of Fritsh et al, which is fully compatible with our proposed mechanism. We suggest that ISR activation in VWMD decreases translation of Mcl-1 mRNA, leading to reduced MCL-1 protein expression. MCL-1 protein is rapidly degraded and may therefore be a more sensitive detector of impaired translation than other readouts. We currently cite published work documenting altered translation in VWMD in the manuscript and in the revision will add the reference Moon et al, which is directly on point (11).

      One would a priori not expect to find altered MCL-1 synthesis rates in the mildly affected VWM mouse model Eif2B5R132H/R132H.

      The model does not show reduced global translation under normal conditions, but rather hypo-activity of eIF2B affects the translation of specific mRNAs (12). We will make this point clear in the revision.

      Actually, ISR deregulation has not been reported in the Eif2B5R132H/R132H VWM mouse model. The authors need to rephrase this part of their study taking this information into account, when explaining their experiments and interpreting their results.

      Consistent with the data that the ISR is activated in VWMD, mice show ATF4 up-regulation and other evidence of ISR activation (13) and impaired responses to physiologic stress (14, 15). In the revision, we will add these citations. To address the reviewer concerns, we will state in the revision that ISR activation is one of many potential mechanisms of reduced MCL-1 expression.

      The authors now imply that their study adds mechanistic insight into the VWM field and that is not the case.

      As we describe in response to point 3, we will acknowledge in the revision that the assertion that MCL-1 deficiency causes VWMD is hypothetical.

      In addition, Figure 7C shows differences in actin signal rather than MCL-1 signal, suggesting that transfer of the actin protein from the gel to the blot was not optimal for the middle lanes. MCL-1 protein may thus not be reduced in these samples from Eif2B5R132H/R132H VWM mice.

      We stand by our Western blot data that show that MCL-1 levels are lower in the Eif2B5R132H/R132H VWM mouse model, coincident with the onset of symptoms. The Western blot shown is a representative image that includes 3 biological replicates for each condition and of a total of 12 mice. The quantification demonstrates the reproducibility of the finding.

      • Can the authors show in which cell type apoptosis was found (lines 315-316)? Their study uses the hGFAP-Cre mouse model to generate conditional Mcl-1 knock-out mice. The original paper by Zhuo et al. describing the hGFAP-promoter mouse model suggests that Mcl-1 expression is also affected in neurons and ependymal cells. The authors can investigate this further to assess which cell types (1) are sensitive to apoptosis by Mcl-1 deletion and (2) depend on Bax and Bak.*

      Apoptosis may occur at different times in different cell populations, and asynchronous apoptosis can be difficult to detect at any point in time, which can complicate the suggested studies. Despite significant effort, we have not been able to co-localize any markers with dying cells in our model.

      To address the question of neuronal involvement, the revised manuscript will refer to prior published studies (16-18) which show that Mcl-1 deletion affects forebrain neural progenitors. In this context, we will discuss that our Mcl-1 deletion studies show that significant neural progenitor populations survive prenatal Mcl-1 deletion and generate appropriate cortical and hippocampal architecture in Mcl-1-deleted mice at P7, prior to the onset of white matter degeneration.

      To identify involved glial cells, we quantified the cells that were depleted or persisted in the Mcl-1 deleted brain. These studies identified oligodendrocytes and Bergmann glia as cell types depleted during P7-P15, when postnatal degeneration occurs in Mcl-1 deleted mice. In contrast, astrocytes persisted, indicating that astrocytes are not MCL-1-dependent. In the revision, we will add new data quantifying the immature, PDGFRA-expressing subset of oligodendrocytes, which will more precisely specify which cells are depleted by Mcl-1 deletion.

      We share the reviewer’s interest in the question of which subsets of Mcl-1 dependent cells are rescued by co-deletion of Bax or Bak. As known markers may not be sufficient to distinguish these subsets, we consider that scRNA-seq studies are an ideal approach to identify these subsets and their specific gene expression patterns. However, these studies are outside the scope of the present work, which establishes that specific white matter cells depend on Mcl-1.

      • Heterozygous deletion of Bak greatly reduces the number of Bak-expressing cells (Fig. 3C, lines 331-333). Authors need to explain this remarkable finding. *

      As we state in the text, the reduced Bak expression in the heterozygous Bak +/- mice is consistent with a gene dosage effect, which has been observed for other genes.

      Please provide raw IHC data.

      Our IHC data is “raw” in the sense of unaltered. We are happy to include a supplementary figure with additional low power and high-power images of BAK staining.

      Co-staining with neuronal, astrocytic or oligodendrocytic markers would be insightful.

      To address this point, we have successfully performed double labeling with antibodies to BAK and with antibodies to the oligodendrocyte marker SOX10 and the astrocyte marker GFAP. We will add these images to the revision. These images show that BAK+ cells include oligodendrocytes and astrocytes. The position and morphology of the BAK+ cells show that they are not neurons.

      In addition, what does the Western blot signal for the BAX protein represent in Bax homozygous knock out mice (Fig. 3C)?

      We will add text stating that the small residual BAX protein detected in the conditional Bax-deleted mice can be attributed to BAX expression in cells outside the Gfap lineage, including endothelial cells, vascular fibroblasts, and microglia.

      Can the percentage of BAX+ cells in Mcl-1/BaxdKO corpus callosum be determined, similarly as was done for BAK? Co-staining with neuronal, astrocytic or oligodendrocytic markers would be insightful here as well. The legend of Fig. 3D does not state what staining is shown (H&E?).

      We were not able to label BAX protein in individual cells using immunohistochemistry. In contrast, BAK immunohistochemistry worked well, allowing us to analyze the cellular distribution of BAK protein. We will revise the legend of Fig. 3D to state that the staining is H&E.

      • What explains the strong GFAP expression in processes of Mcl-1 KO astrocytes? Are these cells refractory to apoptosis or to hGFAP-driven Cre expression and recombination? Do they lack BAK or BAX or other apoptosis-regulating proteins? Or do specific factors compensate for the loss of MCL-1?*

      As we discuss in our response to point 1 above, not all cells require MCL-1 to prevent spontaneous apoptosis. The persistence of GFAP+ astrocytes in Mcl-1-deleted mice shows that astrocytes do not require MCL-1 to maintain their survival. These data do not mean that these astrocytes are refractory to apoptosis, but rather they are not primed for apoptosis in a way that is critically restricted by MCL-1. We will add a discussion of these implications to the revision.

      • Which developing symptoms do the authors refer to in line 468? Please specify and introduce appropriate references.*

      We will add a description of symptoms to the revision.

      • The definition of leukodystrophies given in the paper is outdated. Leukodystrophies are not invariably progressive and fatal disorders. For more recent definition of leukodystrophies see Vanderver et al., Case definition and classification of leukodystrophies and leukoencephalopathies, Mol Genet Metab 2015, and van der Knaap et al., Leukodystrophies a proposed classification system based on pathology, Acta Neuropathol 2017.*

      We appreciate this advice. We will revise the Introduction accordingly and cite the recommended work.

      • It is not correct that there is no specific targeted therapy clinically implemented to arrest progression of the disease in any leukodystrophy. Perhaps hematopoietic stem cell transplantation is not specific targeted, although curative if applied in time in adrenoleukodystrophy and metachromatic leukodystrophy, but certainly genetically engineered autologous hematopoietic stem cells would qualify the definition. In any case, the suggestion that no leukodystrophy is treatable is not correct.*

      We appreciate this correction. We will revise the text to provide a more detailed description of treatment options while underscoring the need for mechanistic insight.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors characterize the phenotype associated with brain-specific deletion of the mcl-1 gene in mice as a model for vanishing white matter-like disease in humans. Unfortunately, the gfap gene is expressed in many cell types during development which are outside of the intended cell type for this study, so functional data presented from the mutant mice is open to interpretation. The authors have not ruled out other interpretations of their results. The authors need to address major shortcomings in their data interpretation by addressing the following issues.

      We appreciate the concerns related to vision and hearing in the Mcl-1-deleted mice, and address them as described below.

      On line 57, the authors indicate that seizures are common in leukodystrophy. This is controversial. Patients may have attacks that look like seizures, but without EEG recordings there is no way to distinguish these events from myoclonus. The authors should note this ambiguity.

      We will note this ambiguity in the revision.

      On line 58, the authors indicate the absence of treatments for leukodystrophies. The authors should review the following articles: PMID: 7582569, 15452666 and 27882623, and moderate the text.

      We will cite these papers and moderate the text as recommended.

      The methods section is lacking in details in several areas. For example beginning line 136, there is virtually no indication of the MRI details without going to secondary literature. The authors should provide a brief description including magnet strength, type of imaging and the general sequence, software used to collect and analyze the images.

      We will include these details in the revision.

      Were the brains actually harvested fresh, where mechanical stresses easily deform brain structure, prior to immersion fixation for 48h? This could be troubling despite the method being previously published.

      Brains were harvested fresh and drop fixed. We have extensive experience over more than ten years in handling brain tissue from neonatal mice and subsequently analyzing MRI images and sections. These methods have allowed us to make quantitative volumetric comparisons of the 3-dimensional architecture of the developing brain using MRI in prior studies that detected genotypic differences in brain growth without confounding fixation artefact (19). We can confirm that no mechanical stress of handling can reproduce the white matter specific changes that we see in the Mcl-1-deleted brain. We did not detect any abnormalities in control brains subjected to the same handling techniques.

      Beginning on line 126, the authors could at least indicate the fixative details and whether the mice were perfused or tissue was immersion fixed. Compare this lack of detail with the description of lysis buffer beginning on line 158.

      We will add fixation details to the revision.

      Behavioral testing at young ages is rather problematic regarding data interpretation. For example, open field testing (Fig. 2B) at postnatal day 7, which relies on visual cues, is rather dubious when mice do not open their eyes until 12-13 days after birth. How would the pups know if they were in the middle of an open field and exhibit thigmotaxis, even if they were capable of the behavior at such a young age? Thus, the P7 data likely cannot be interpreted in terms of the knockouts being normal.

      We fully agree with the reviewer on the challenges with behavioral analysis of such young mice. The rationale for the open field test was that, at P7, mouse pups are gaining greater control of hind limb function, which can be observed as a transition from pivoting in one place to forward locomotion. Thus, we measured the number of pivots and distance traveled in the open field as indicators for maturation of motor function. Center time was presented to show that, at P7, both WT and knockout mice stayed in the middle (i.e., the groups were at the same stage of limited mobility). We consider that these measures, together with geotaxis and latency to righting (Table 1), provide a developmentally-appropriate neurologic assessment for an age when behaviors are very limited. We will make clear in the revision that these specific tests must be considered together in order to be informative.

      By P14, when the mutants exhibit a phenotype, they are already significantly underweight, which can lead to non-specific phenotypes such as retinal dysfunction or degeneration. Did the authors look for pathological changes in the retina?

      Further, GFAP is expressed in retina of many vertebrate species (PMID 1283834) which would inactivate mcl1 in that tissue and possibly lead to blindness. Indeed, the table at the following link provides a list of tissues in which the gfap-cre transgene is expressed during development. The authors need to address this major issue. http://www.informatics.jax.org/allele/MGI:2179048?recomRibbon=open

      We appreciate this suggestion and we will look for pathology in the retina and optic nerve. Such pathology, if we find it, is likely to be specific, as the optic nerve is myelinated and we have already noted extensive myelination abnormalities in the Mcl-1-deleted mice. If we find retinal or optic nerve abnormalities, we will note the potential for these abnormalities to impact on open field testing.

      For the startle response, which relies on normal hearing, did the authors check to determine if the mutants are deaf? This is very difficult at such a young age, especially prior to tight junction assembly in the lateral wall at around P14. Again, GFAP is expressed in the cochlea at an early age (see PMID 20817025) and may have caused degenerative pathology in this tissue. The authors need to address this major issue.

      The reviewer brings up the potential issue of deafness as a confounding factor for acoustic startle testing. Our results showed that startle responses in the mutant mice were increased at P14, which clearly indicates the mice were able to hear the acoustic stimuli. Further, at P14 and P21, both WT and knockout mice had orderly patterns of prepulse inhibition, providing confirmation of good hearing ability at each timepoint. We will make these points clear in the revision.

      Reviewer #2 (Significance (Required)):

      Unknown.

      The reviewer has not raised specific issues with the significance. We consider the significance of our work to be the finding that oligodendrocyte-lineage glial cells depend on MCL-1 and thus are primed for apoptosis, such that disrupting MCL-1 expression results in catastrophic degeneration of the cerebral white matter. Addressing the reviewer’s concerns described in the section on Evidence, reproducibility and clarity will support this significance.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Cleveland et al. tried to argue that brain-specific depletion of apoptosis regulator MCL-1 reproduces Vanishing White Matter Disease (VWMD) in mice. The authors show that brain-specific MCL-1 deficiency leads to brain atrophy, increased brain cell apoptosis, decreased oligodendrocytes, decreased MBP immunoreactivity, and activation of astrocytes and microglia. It is known that VWMD is a hypomyelinating disorder caused by mutation of eIF2B subunits, which displays severe myelin loss but minimal oligodendrocyte apoptosis or loss in the CNS white matter. In fact, a number of studies show increased oligodendrocyte numbers in the CNS white matter.

      Published reports show decreased normal oligodendrocytes and increased immature oligodendrocyte populations (20).

      The characteristic oligodendrocyte pathology is foamy oligodendrocytes (Wong et al., 2000), rather than apoptosis.

      Foamy oligodendrocyte pathology and increased oligodendrocyte apoptosis are not mutually exclusive. The above referenced paper, Wong et al., in addition to foamy oligodendrocytes, also describes a “decrease in numbers of cells with oligodendroglial phenotype, both normal and abnormal” (21); this decrease is compatible with increased apoptosis. Moreover, published reports specifically describe apoptotic oligodendrocytes in human brains with VWMD (22). To address this point, we propose to include both of these citations in the revision to reference foamy oligodendrocyte pathology in VWMD and to state that this pathologic finding does not exclude a role for apoptosis in VWMD pathogenesis.

      Since the CNS pathology of brain-specific MCL-1 deficient mice is driven by brain cell apoptosis, the relevance of this mouse model to VWMD is very limited.

      Whether apoptosis plays a mechanistic role in VWMD is less clear than this comment suggests, as described in multiple publications (22, 23).

      The title of this manuscript is misleading, and should be changed.

      We accept that our statement that Mcl-1-deletion recapitulates VWMD is premature and not adequately supported by the available data. We will revise the title, introduction and discussion accordingly, to focus on the white matter specificity of the Mcl-1-deletion phenotype.

      Moreover, there are a number of major concerns.

      1. Figure 1 clearly shows severe atrophy of neocortex in Mcl-1 cKO mice; however, the white matter appears largely normal in the cerebellum and brain stem. Mcl-1 cKO mice also display ventricular dilation and possible atrophy of corpus callosum. The authors should discuss severe atrophy of neocortex in Mcl-1 cKO mice and the possibility that ventricular dilation and corpus callosum atrophy result from severe atrophy of neocortex.

      The cortical atrophy that the reviewer notes begins after P7 and is minimal at P14 when white matter loss is already pronounced. At P21, when there is clear cortical thinning, the white matter loss is extreme. Based on the time course, we consider that the white matter loss is the primary pathology, and the cortical thinning is secondary. Importantly, glial cells populate the cortex as well as the white matter and our cellular data show that oligodendrocytes are reduced in the cortex as well as in the white matter structures. Based on these lines of evidence, we consider that the primary cell type affected is the oligodendroglial population of the glia. We will add a discussion along these lines to the revision.

      We agree that the brain stem is preserved. Our data show that the hGFAP-Cre promoter is least efficient in the brain stem and midbrain regions (Sup Fig.1). We will note this differential efficiency in the revision.

      2. The motor and sensory tests in Figure 2 are potentially interesting, but their relevance to myelin abnormalities is limited. The authors should perform behavioral tests that are highly relevant to myelin abnormalities.

      The tests presented show progressive neurologic impairment, correlating with the onset of neuropathology. In the revision we will note that ataxia and tremor are common features of leukodystrophies and the Mcl-1-deleted mice show both ataxia and tremor.

      3. It is well expected that there are increased apoptotic cells in the brain of Mcl-1 cKO mice. The authors should perform double labeling to demonstrate which cell types undergo apoptosis: neurons, oligodendrocytes, or other cell types? On the other hand, Figure 3A shows that there are substantial apoptotic cells in the cerebral cortex, which is consistent with severe cerebral cortex atrophy in Mcl-1 cKO mice, suggesting neuron apoptosis in the cerebral cortex. Neuron apoptosis would further rule out the relevance of Mcl-1 cKO mice to VWMD.

      These studies would be of interest, but we have not been able to co-label apoptotic cells in the Mcl-1-deleted mice with any marker. In the advanced state of apoptosis when dying cells are detectable by TUNEL staining, the relevant marker proteins have been degraded beyond recognition by IHC. In contrast, the apoptotic marker cleaved caspase-3, which is positive earlier in the apoptotic process and might allow marker co-labeling, was not detectably elevated in the Mcl-1-deleted mice. We attribute the lack of cleaved caspase-3+ cells to the asynchronous nature of the increased cell death, and to the short duration in which dying cells are cleaved caspase-3+. While double label studies of dying cells have been problematic, our studies quantifying each cell type provide information to address the reviewer’s question. Our cell counts show clearly that oligodendrocytes are the primary cell type reduced in number in the Mcl-1 deleted mice.

      4. In Figures 1 and 4, the authors use H&E staining to demonstrate white matter loss. H&E staining is good to show general CNS morphology; however, it is impossible to use H&E staining to quantify the integrity of the white matter. The authors should perform specific staining to quantify white matter loss in the mouse models.

      Our MBP stains later in the paper are used to quantify white matter loss.

      5. Figure 5, MBP IHC is good to show general myelin staining, but is not a reliable assay to quantify myelin integrity in the CNS. The authors should perform electron microscopy analysis to quantify myelin integrity in the CNS in the mouse models.

      Our studies of MBP staining show that the myelinated area in cross sections is significantly reduced in the Mcl-1-deleted mice. Electron microscopy studies cannot show whether the myelinated area is reduced and studies of myelin integrity are not needed to prove that reduced oligodendrocytes correlate with reduced myelination.

      6. Figure 6, SOX10 is a marker of oligodendrocytes and OPCs. The authors should quantify the number of oligodendrocytes (using oligodendrocyte markers, such as CC1) and the number of OPCs (using OPC markers, such as NG2). Does deletion of BAK or BAX reduce oligodendrocyte apoptosis in the CNS of Mcl-1 cKO mice?

      We agree that this is an important question, and we are working to quantify OPCs in the Mcl-1-deleted mice by counting cells labelled with the OPC marker PDGFRA. We will add these data to the revision and discuss their significance when we know what they show.

      7. The authors show that the level of MCL-1 is comparable in brain lysates of wildtype and eIF2B5 R132H/R132H mice at the age of 7 months, and moderately decreased in eIF2B5 R132H/R132H mice at the age of 10 months. VWMD is a developmental disorder. Similarly, brain-specific MCL-1 deficiency causes developmental abnormalities in the CNS. The normal level of MCL-1 in 7-month-old eIF2B5 R132H/R132H mice strongly suggests that MCL-1 is not a major player involved in the pathogenesis of VWMD. Does brain-specific MCL-1 deficiency starting at the age of 10 months (using CreERT mice) cause CNS abnormalities in adult mice?

      We agree that Mcl-1 deletion in our model disrupts postnatal brain development. Our studies show that in early life, oligodendrocytes depend on MCL-1 to prevent spontaneous apoptosis. It is an interesting but separate question whether Mcl-1 deletion induced in the adult would also cause a similar phenotype. The suggested studies would take over a year to conduct, and while they are of interest, they are not required to prove our main point, which is that developmental leukodystrophies may result from the dependence of oligodendrocytes on MCL-1. In the revision, we will state that our comparison of the Mcl-1-deletion phenotype to VWMD is hypothetical, and that additional studies are needed to test this hypothesis.

      8. Does MCL-1 deletion exacerbate the pathology in eIF2B5 R132H/R132H mice? Moreover, does MCL-1 overexpression rescue the pathology in eIF2B5 R132H/R132H mice? These two experiments are necessary to demonstrate the involvement of MCL-1 in VWMD pathogenesis.

      We agree that these are interesting and important studies; however, these studies will require years to complete and extensive resources. These studies are not needed to show that Mcl-1 deletion produces early onset white matter degeneration, which is our main point. As in our response to point 7 above, we will state in the revision that our comparison of the Mcl-1-deletion phenotype to VWMD is hypothetical, and list these experiments as follow-up studies that are needed to test this hypothesis.

      Reviewer #3 (Significance (Required)):

      The study will not significantly advance the understanding of VWMD pathogenesis.

      We recognize that our assertion of a direct relevance to VWMD was premature, and that additional studies, beyond the scope of this project, are needed to determine if MCL-1 deficiency contributes to VWMD pathology. We agree that the available data do not yet inform VWMD pathogenesis, but these data may become relevant to VWMD as follow-up studies are conducted. The data remain highly relevant to the broad group of leukodystrophies as they demonstrate a physiologic vulnerability of oligodendrocytes that sets them apart from astrocytes and neurons, and thus may play a role in disorders in which oligodendrocyte pathology is central.

      Neuroscientists may be interested in the reported findings.

      We appreciate the reviewer noting the significance for neuroscience.

      My field of expertise: oligodendrocyte, myelin, neurodegeneration, ER stress

      References cited:

      1. K. A. Sarosiek, C. Fraser, N. Muthalagu, P. D. Bhola, W. Chang, S. K. McBrayer, A. Cantlon, S. Fisch, G. Golomb-Mello, J. A. Ryan, J. Deng, B. Jian, C. Corbett, M. Goldenberg, J. R. Madsen, R. Liao, D. Walsh, J. Sedivy, D. J. Murphy, D. R. Carrasco, S. Robinson, J. Moslehi, A. Letai, Developmental Regulation of Mitochondrial Apoptosis by c-Myc Governs Age- and Tissue-Specific Sensitivity to Cancer Therapeutics. Cancer Cell 31, 142-156 (2017).
      2. R. Dumitru, V. Gama, B. M. Fagan, J. J. Bower, V. Swahari, L. H. Pevny, M. Deshmukh, Human Embryonic Stem Cells Have Constitutively Active Bax at the Golgi and Are Primed to Undergo Rapid Apoptosis. Mol Cell 46, 573-583 (2012).
      3. T. Ni Chonghaile, K. A. Sarosiek, T. T. Vo, J. A. Ryan, A. Tammareddi, G. Moore Vdel, J. Deng, K. C. Anderson, P. Richardson, Y. T. Tai, C. S. Mitsiades, U. A. Matulonis, R. Drapkin, R. Stone, D. J. Deangelo, D. J. McConkey, S. E. Sallan, L. Silverman, M. S. Hirsch, D. R. Carrasco, A. Letai, Pretreatment mitochondrial priming correlates with clinical response to cytotoxic chemotherapy. Science 334, 1129-1133 (2011).
      4. J. A. Ryan, J. K. Brunelle, A. Letai, Heightened mitochondrial priming is the basis for apoptotic hypersensitivity of CD4+ CD8+ thymocytes. Proc Natl Acad Sci U S A 107, 12895-12900 (2010).
      5. M. Certo, V. D. G. Moore, M. Nishino, G. Wei, S. Korsmeyer, S. A. Armstrong, A. Letai, Mitochondria primed by death signals determine cellular addiction to antiapoptotic BCL-2 family members. Cancer Cell 9, 351-365 (2006).
      6. A. J. Crowther, V. Gama, A. Bevilacqua, S. X. Chang, H. Yuan, M. Deshmukh, T. R. Gershon, Tonic activation of Bax primes neural progenitors for rapid apoptosis through a mechanism preserved in medulloblastoma. The Journal of neuroscience : the official journal of the Society for Neuroscience 33, 18098-18108 (2013).
      7. D. Rodriguez, A. Gelot, B. della Gaspera, O. Robain, G. Ponsot, L. L. Sarlieve, S. Ghandour, A. Pompidou, A. Dautigny, P. Aubourg, D. Pham-Dinh, Increased density of oligodendrocytes in childhood ataxia with diffuse central hypomyelination (CACH) syndrome: neuropathological and biochemical study of two cases. Acta Neuropathol 97, 469-480 (1999).
      8. Y. L. Wong, L. LeBon, A. M. Basso, K. L. Kohlhaas, A. L. Nikkel, H. M. Robb, D. L. Donnelly-Roberts, J. Prakash, A. M. Swensen, N. D. Rubinstein, S. Krishnan, F. E. McAllister, N. V. Haste, J. J. O'Brien, M. Roy, A. Ireland, J. M. Frost, L. Shi, S. Riedmaier, K. Martin, M. J. Dart, C. Sidrauski, eIF2B activator prevents neurological defects caused by a chronic integrated stress response. Elife 8, (2019).
      9. A. Trimouille, F. Marguet, F. Sauvestre, E. Lasseaux, F. Pelluard, M. L. Martin-Negrier, C. Plaisant, C. Rooryck, D. Lacombe, B. Arveiler, O. Boespflug-Tanguy, S. Naudion, A. Laquerriere, Foetal onset of EIF2B related disorder in two siblings: cerebellar hypoplasia with absent Bergmann glia and severe hypomyelination. Acta Neuropathol Commun 8, 48 (2020).
      10. T. E. M. Abbink, L. E. Wisse, E. Jaku, M. J. Thiecke, D. Voltolini-Gonzalez, H. Fritsen, S. Bobeldijk, T. J. Ter Braak, E. Polder, N. L. Postma, M. Bugiani, E. A. Struijs, M. Verheijen, N. Straat, S. van der Sluis, A. A. M. Thomas, D. Molenaar, M. S. van der Knaap, Vanishing white matter: deregulated integrated stress response as therapy target. Ann Clin Transl Neurol 6, 1407-1422 (2019).
      11. S. L. Moon, R. Parker, EIF2B2 mutations in vanishing white matter disease hypersuppress translation and delay recovery during the integrated stress response. RNA 24, 841-852 (2018).
      12. G. Raini, R. Sharet, M. Herrero, A. Atzmon, A. Shenoy, T. Geiger, O. Elroy-Stein, Mutant eIF2B leads to impaired mitochondrial oxidative phosphorylation in vanishing white matter disease. J Neurochem 141, 694-707 (2017).
      13. L. Kantor, D. Pinchasi, M. Mintz, Y. Hathout, A. Vanderver, O. Elroy-Stein, A point mutation in translation initiation factor 2B leads to a continuous hyper stress state in oligodendroglial-derived cells. PLoS One 3, e3783 (2008).
      14. Y. Cabilly, M. Barbi, M. Geva, L. Marom, D. Chetrit, M. Ehrlich, O. Elroy-Stein, Poor cerebral inflammatory response in eIF2B knock-in mice: implications for the aetiology of vanishing white matter disease. PLoS One 7, e46715 (2012).
      15. L. Marom, I. Ulitsky, Y. Cabilly, R. Shamir, O. Elroy-Stein, A point mutation in translation initiation factor eIF2B leads to function--and time-specific changes in brain gene expression. PLoS One 6, e26992 (2011).
      16. L. C. Fogarty, R. T. Flemmer, B. A. Geizer, M. Licursi, A. Karunanithy, J. T. Opferman, K. Hirasawa, J. L. Vanderluit, Mcl-1 and Bcl-xL are essential for survival of the developing nervous system. Cell Death Differ 26, 1501-1515 (2019).
      17. S. M. Hasan, A. D. Sheen, A. M. Power, L. M. Langevin, J. Xiong, M. Furlong, K. Day, C. Schuurmans, J. T. Opferman, J. L. Vanderluit, Mcl1 regulates the terminal mitosis of neural precursor cells in the mammalian brain through p27Kip1. Development 140, 3118-3127 (2013).
      18. C. D. Malone, S. M. Hasan, R. B. Roome, J. Xiong, M. Furlong, J. T. Opferman, J. L. Vanderluit, Mcl-1 regulates the survival of adult neural precursor cells. Mol Cell Neurosci 49, 439-447 (2012).
      19. S. E. Williams, I. Garcia, A. J. Crowther, S. Li, A. Stewart, H. Liu, K. J. Lough, S. O'Neill, K. Veleta, E. A. Oyarzabal, J. R. Merrill, Y. I. Shih, T. R. Gershon, Aspm sustains postnatal cerebellar neurogenesis and medulloblastoma growth. Development, (2015).
      20. M. Bugiani, I. Boor, B. van Kollenburg, N. Postma, E. Polder, C. van Berkel, R. E. van Kesteren, M. S. Windrem, E. M. Hol, G. C. Scheper, S. A. Goldman, M. S. van der Knaap, Defective glial maturation in vanishing white matter disease. J Neuropathol Exp Neurol 70, 69-82 (2011).
      21. K. Wong, R. C. Armstrong, K. A. Gyure, A. L. Morrison, D. Rodriguez, R. Matalon, A. B. Johnson, R. Wollmann, E. Gilbert, T. Q. Le, C. A. Bradley, K. Crutchfield, R. Schiffmann, Foamy cells with oligodendroglial phenotype in childhood ataxia with diffuse central nervous system hypomyelination syndrome. Acta Neuropathol 100, 635-646 (2000).
      22. K. Van Haren, J. P. van der Voorn, D. R. Peterson, M. S. van der Knaap, J. M. Powers, The life and death of oligodendrocytes in vanishing white matter disease. J Neuropathol Exp Neurol 63, 618-630 (2004).
      23. M. Bugiani, I. Boor, J. M. Powers, G. C. Scheper, M. S. van der Knaap, Leukoencephalopathy with vanishing white matter: a review. J Neuropathol Exp Neurol 69, 987-996 (2010).
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Cleveland et al. tried to argue that brain-specific depletion of apoptosis regulator MCL-1 reproduces Vanishing White Matter Disease (VWMD) in mice. The authors show that brain-specific MCL-1 deficiency leads to brain atrophy, increased brain cell apoptosis, decreased oligodendrocytes, decreased MBP immunoreactivity, and activation of astrocytes and microglia. It is known that VWMD is a hypomyelinating disorder caused by mutation of eIF2B subunits, which displays severe myelin loss but minimal oligodendrocyte apoptosis or loss in the CNS white matter. In fact, a number of studies show increased oligodendrocyte numbers in the CNS white matter. The characteristic oligodendrocyte pathology is foamy oligodendrocytes (Wong et al., 2000), rather than apoptosis. Since the CNS pathology of brain-specific MCL-1 deficient mice is driven by brain cell apoptosis, the relevance of this mouse model to VWMD is very limited. The title of this manuscript is misleading, and should be changed. Moreover, there are a number of major concerns.

      1. Figure 1 clearly shows severe atrophy of neocortex in Mcl-1 cKO mice; however, the white matter appears largely normal in the cerebellum and brain stem. Mcl-1 cKO mice also display ventricular dilation and possible atrophy of corpus callosum. The authors should discuss severe atrophy of neocortex in Mcl-1 cKO mice and the possibility that ventricular dilation and corpus callosum atrophy result from severe atrophy of neocortex.
      2. The motor and sensory tests in Figure 2 are potentially interesting, but their relevance to myelin abnormalities is limited. The authors should perform behavioral tests that are highly relevant to myelin abnormalities.
      3. It is well expected that there are increased apoptotic cells in the brain of Mcl-1 cKO mice. The authors should perform double labeling to demonstrate which cell types undergo apoptosis: neurons, oligodendrocytes, or other cell types? On the other hand, Figure 3A shows that there are substantial apoptotic cells in the cerebral cortex, which is consistent with severe cerebral cortex atrophy in Mcl-1 cKO mice, suggesting neuron apoptosis in the cerebral cortex. Neuron apoptosis would further rule out the relevance of Mcl-1 cKO mice to VWMD.
      4. In Figures 1 and 4, the authors use H&E staining to demonstrate white matter loss. H&E staining is good to show general CNS morphology; however, it is impossible to use H&E staining to quantify the integrity of the white matter. The authors should perform specific staining to quantify white matter loss in the mouse models.
      5. Figure 5, MBP IHC is good to show general myelin staining, but is not a reliable assay to quantify myelin integrity in the CNS. The authors should perform electron microscopy analysis to quantify myelin integrity in the CNS in the mouse models.
      6. Figure 6, SOX10 is a marker of oligodendrocytes and OPCs. The authors should quantify the number of oligodendrocytes (using oligodendrocyte markers, such as CC1) and the number of OPCs (using OPC markers, such as NG2). Does deletion of BAK or BAX reduce oligodendrocyte apoptosis in the CNS of Mcl-1 cKO mice?
      7. The authors show that the level of MCL-1 is comparable in brain lysates of wildtype and eIF2B5 R132H/R132H mice at the age of 7 months, and moderately decreased in eIF2B5 R132H/R132H mice at the age of 10 months. VWMD is a developmental disorder. Similarly, brain-specific MCL-1 deficiency causes developmental abnormalities in the CNS. The normal level of MCL-1 in 7-month-old eIF2B5 R132H/R132H mice strongly suggests that MCL-1 is not a major player involved in the pathogenesis of VWMD. Does brain-specific MCL-1 deficiency starting at the age of 10 months (using CreERT mice) cause CNS abnormalities in adult mice?
      8. Does MCL-1 deletion exacerbate the pathology in eIF2B5 R132H/R132H mice? Moreover, does MCL-1 overexpression rescue the pathology in eIF2B5 R132H/R132H mice? These two experiments are necessary to demonstrate the involvement of MCL-1 in VWMD pathogenesis.

      Significance

      The study will not significantly advance the understanding of VWMD pathogenesis.

      Neuroscientists may be interested in the reported findings.

      My field of expertise: oligodendrocyte, myelin, neurodegeneration, ER stress

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, the authors characterize the phenotype associated with brain-specific deletion of the mcl-1 gene in mice as a model for vanishing white matter-like disease in humans. Unfortunately, the gfap gene is expressed in many cell types during development which are outside of the intended cell type for this study, so functional data presented from the mutant mice is open to interpretation. The authors have not ruled out other interpretations of their results. The authors need to address major shortcomings in their data interpretation by addressing the following issues.

      On line 57, the authors indicate that seizures are common in leukodystrophy. This is controversial. Patients may have attacks that look like seizures, but without EEG recordings there is no way to distinguish these events from myoclonus. The authors should note this ambiguity.

      On line 58, the authors indicate the absence of treatments for leukodystrophies. The authors should review the following articles: PMID: 7582569, 15452666 and 27882623, and moderate the text.

      The methods section is lacking in details in several areas. For example beginning line 136, there is virtually no indication of the MRI details without going to secondary literature. The authors should provide a brief description including magnet strength, type of imaging and the general sequence, software used to collect and analyze the images. Were the brains actually harvested fresh, where mechanical stresses easily deform brain structure, prior to immersion fixation for 48h? This could be troubling despite the method being previously published.

      Beginning on line 126, the authors could at least indicate the fixative details and whether the mice were perfused or tissue was immersion fixed. Compare this lack of detail with the description of lysis buffer beginning on line 158.

      Behavioral testing at young ages is rather problematic regarding data interpretation. For example, open field testing (Fig. 2B) at postnatal day 7, which relies on visual cues, is rather dubious when mice do not open their eyes until 12-13 days after birth. How would the pups know if they were in the middle of an open field and exhibit thigmotaxis, even if they were capable of the behavior at such a young age? Thus, the P7 data likely cannot be interpreted in terms of the knockouts being normal. By P14, when the mutants exhibit a phenotype, they are already significantly underweight, which can lead to non-specific phenotypes such as retinal dysfunction or degeneration. Did the authors look for pathological changes in the retina?

      Further, GFAP is expressed in retina of many vertebrate species (PMID 1283834) which would inactivate mcl1 in that tissue and possibly lead to blindness. Indeed, the table at the following link provides a list of tissues in which the gfap-cre transgene is expressed during development. The authors need to address this major issue. http://www.informatics.jax.org/allele/MGI:2179048?recomRibbon=open

      For the startle response, which relies on normal hearing, did the authors check to determine if the mutants are deaf? This is very difficult at such a young age, especially prior to tight junction assembly in the lateral wall at around P14. Again, GFAP is expressed in the cochlea at an early age (see PMID 20817025) and may have caused degenerative pathology in this tissue. The authors need to address this major issue.

      Significance

      Unknown.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Comments

      1. MCL-1 promotes the survival of different cell lineages through its ability to inhibit the pro-apoptotic proteins BAK and BAX, the main effectors of cell death in mammalian cells. By depleting brain cells of MCL-1, apoptosis is promoted in these cells, as confirmed by histopathology of the mouse brain. This is, however, a nonspecific process. Depletion of MCL-1 in any tissue would promote apoptosis in cells of this tissue and general knock-out is known to cause embryonic lethality. So, it is no surprise that knock out of MCL-1 in brain cells leads to a brain disease.
      2. The white matter disease is interpreted as similar to VWM; VWM is specifically investigated and MCL-1 is found to be decreased in VWM brain tissue. The decrease is most likely nonspecific. Decrease in MCL-1 is most likely part of a general mechanism of degeneration of brain tissue or white matter. That is a different but also important conclusion. It is essential that other progressive leukodystrophies and acquired brain diseases with tissue degeneration, such as encephalitis, are investigated as well to see whether MCL-1 is also decreased in these disorders. If so, the MCL-1 decrease in white matter disease and other brain degenerative disease should be described as a final common pathway rather than specifically applicable to VWM.
      3. Adding to point 2 is the fact that the pathology of the brain-specific MCL-1 knock-out mouse does not resemble the pathology of VWM at all. The central features of VWM are abnormal astrocyte morphology with astrocytes having a few stunted processes, lack of reactive astrogliosis, lack of microgliosis, increase in number of oligodendrocytes and presence of foamy oligodendrocytes. The increase in oligodendrocytes in VWM may be such that the high cellularity leads to diffusion restriction on MRI. Bergmann glia are typically ectopic, but not reduced in number. By contrast, the brain-specific MCL-1 knock-out mouse is characterized by decreased numbers of oligodendrocytes, increased numbers of microglia, reactive astrogliosis, decreased numbers of Bergmann glia and ectopic granule cells. No morphological abnormalities of oligodendrocytes and astrocytes are observed. So, histopathologically the only shared feature is preferential involvement of the brain white matter.
      4. The clarity of the work would benefit from a different approach to introduce the study. It would help the reader to know that (1) gray matter cell specific Mcl-1 deletion in mice did not cause apoptosis and (2) apoptosis may have different effector proteins. This important information is now in the discussion. The switch to another cell type in the brain (hGFAP+ cells) would be logical and the significance of the work may improve. When approaching the topic from the field of leukodystrophies one would not necessarily think of deleting the Mcl-1 gene, especially as this gene is not associated with any known leukodystrophy and tends to associate with preneoplastic and neoplastic disease.
      5. The authors claim that the ISR is activated in VWM, which means that eIF2α phosphorylation levels are increased, general protein synthesis is decreased and a transcription pathway is regulated by ATF4 and other factors. However, this is not what is seen in VWM. Increased eIF2α phosphorylation and reduced general protein synthesis are not observed in VWM; strikingly, the level of eIF2α phosphorylation is reduced, general protein synthesis appears at a normal rate, and only the ATF4-regulated transcriptome is continuously expressed in VWM astrocytes. Fritsch et al. show that MCL-1 protein synthesis is reduced by increased eIF2α phosphorylation due to reduced translation rates at the Mcl-1 mRNA and not due to differences in Mcl-1 mRNA levels. One would a priori not expect to find altered MCL-1 synthesis rates in the mildly affected VWM mouse model Eif2B5R132H/R132H. Actually, ISR deregulation has not been reported in the Eif2B5R132H/R132H VWM mouse model. The authors need to rephrase this part of their study taking this information into account, when explaining their experiments and interpreting their results. The authors now imply that their study adds mechanistic insight into the VWM field and that is not the case. In addition, Figure 7C shows differences in actin signal rather than MCL-1 signal, suggesting that transfer of the actin protein from the gel to the blot was not optimal for the middle lanes. MCL-1 protein may thus not be reduced in these samples from Eif2B5R132H/R132H VWM mice.
      6. Can the authors show in which cell type was apoptosis found (lines 315-316)? Their study uses the hGFAP - Cre mouse model to generate conditional Mcl-1 knock-out mice. The original paper by Zhuo et al. describing the hGFAP - promoter mouse model suggests that Mcl-1 expression is also affected in neurons and ependymal cells. The authors can investigate this further to assess which cell types (1) are sensitive to apoptosis by Mcl-1 deletion and (2) depend on Bax and Bak.
      7. Heterozygous deletion of Bak greatly reduces the number of Bak-expressing cells (Fig. 3C, lines 331-333). The authors need to explain this remarkable finding. Please provide raw IHC data. Co-staining with neuronal, astrocytic or oligodendrocytic markers would be insightful. In addition, what does the Western blot signal for the BAX protein represent in Bax homozygous knock out mice (Fig. 3C)? Can the percentage of BAX+ cells in Mcl-1/BaxdKO corpus callosum be determined, similarly as was done for BAK? Co-staining with neuronal, astrocytic or oligodendrocytic markers would be insightful here as well. The legend of Fig. 3D does not state what staining is shown (H&E?).
      8. What explains the strong GFAP expression in processes of Mcl-1 KO astrocytes? Are these cells refractory to apoptosis or to hGFAP-driven Cre expression and recombination? Do they lack BAK or BAX or other apoptosis-regulating proteins? Or do specific factors compensate for the loss of MCL-1?
      9. Which developing symptoms do the authors refer to in line 468? Please specify and introduce appropriate references.
      10. The definition of leukodystrophies given in the paper is outdated. Leukodystrophies are not invariably progressive and fatal disorders. For more recent definition of leukodystrophies see Vanderver et al., Case definition and classification of leukodystrophies and leukoencephalopathies, Mol Genet Metab 2015, and van der Knaap et al., Leukodystrophies a proposed classification system based on pathology, Acta Neuropathol 2017.
      11. It is not correct that there is no specific targeted therapy clinically implemented to arrest progression of the disease in any leukodystrophy. Perhaps hematopoietic stem cell transplantation is not specifically targeted, although it is curative if applied in time in adrenoleukodystrophy and metachromatic leukodystrophy, but genetically engineered autologous hematopoietic stem cells would certainly meet the definition. In any case, the suggestion that no leukodystrophy is treatable is not correct.

      Significance

      see above

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors aimed to understand the control and the elimination of disseminated tumor cells by NK cells within the lung, their main question being how pulmonary NK cells are able to prevent tumor cells from colonization in the lung.

      To dissect this question, Hiroshi Ichise and colleagues took advantage of an ultra-sensitive bioluminescence whole-body imaging system combined with intravital two-photon microscopy of tumor or NK cells expressing genetically encoded biosensors to explore the behavior and functional competences of NK cells in an experimental lung metastasis model.

      First, the authors monitored the fate of intravenously injected B16-Akaluc cells from 5 min to 10 days and observed that tumor cells decrease rapidly within the first 12-24 hours. In parallel, they depleted asialoGM1+ and NK1.1+ cells by injecting anti-aGM1 and anti-NK1.1 antibodies in order to assess the involvement of these populations in the elimination of the disseminated tumor cells. They conclude that the rapid decrease of the tumor cells is mediated by NK cells. Consistent with these first data, the authors also observe the same early NK cell-mediated impact on two other syngeneic mouse tumor cell lines: the BRAFV600E melanoma and the colon adenocarcinoma MC-38.

      In a second part, the authors dissected NK cell dynamic behaviors in the pulmonary capillaries by taking advantage of NKp46iCRExrosa26dtTomato mice, in which NKp46+ cells are fluorescent, and performed 2P intravital imaging to follow the behavior of NKp46+ cells in situ. They could nicely observe that NK cells arrive from the capillaries and patrol on the lung epithelial cells in a stall-crawl-jump manner. Moreover, they also show that the attachment to the pulmonary capillaries is mediated by LFA-1. In the presence of the B16F10 tumor model, they observe that NK cells stay longer in the capillaries and crawl for longer, indicating that NK cells stay in contact longer with tumor cells.

      The authors then explored NK-mediated tumor killing in the lung by measuring tumor cell apoptosis using B16F10-SCAT3 cells (which allow visualization of caspase-3 activation) and Ca2+ influx in tumor cells expressing two Ca2+ sensors, GCaMP6s and R-GECO. They could observe caspase-3 activation but also Ca2+ influx in tumor cells within a few minutes after encountering NK cells. They also observe that evasion of NK cell surveillance is mediated by Nectin-5 and Nectin-2 expressed on tumor cells.

      Then, they focus on NK cell activation by looking at ERK activation. To do so, they isolated NK cells from Tg mice expressing a FRET-based ERK biosensor and performed in vitro killing assays against B16-R-GECO tumor cells as well as in vivo experiments. For the in vivo experiments, they developed reporter mice whose NK cells express the FRET biosensor for ERK. They observe that ERK-dependent NK cell activation contributes to the elimination of disseminated tumor cells within the first few hours but not after 24 hours. Indeed, they observe that B16F10-Akaluc tumor cells are equally eliminated when injected 24 h after a first injection of B16F10 or PBS in mice. The authors concluded that tumor cells acquire the capacity to evade NK cell surveillance after 24 h, rather than NK cells losing tumoricidal activity over time.

      Finally, the authors followed up on their last result concerning the potential evasion of NK cell surveillance by tumor cells. They show that this NK cell evasion is mediated by the shedding of cell surface Necl-5. They next show that cleavage of the extracellular domain of Necl-5 is mediated by thrombin in vitro and that anticoagulants such as Warfarin, Edoxaban or Dabigatran Etexilate promote tumor elimination, as observed in the bioluminescence experiments. This loss prevents the NK cell signaling needed for effective killing of tumor targets.

      However, most of the results remain correlations and have not been formally demonstrated or lack controls.

      B16F10 is a well-known and well-characterized NK cell target in an in vivo model, so the first part is not really new except for the in situ behavior of NK cells within the lung capillaries. The new mechanism of thrombin-mediated shedding of Necl-5 causing evasion from NK cell surveillance is really concentrated in the last figure (Fig. 6), and some supplemental experiments are needed to confirm this claim.

      Response: We deeply appreciate the reviewer’s effort to evaluate our work. The reviewer criticizes that the mechanism is well known except “the in situ behavior of NK cells within the lung capillaries.” Indeed, this is what we wish to emphasize in our work. Nobody has ever seen how NK cells kill metastatic tumor cells in the lung. There is a big gap between in vitro tissue culture experiments and in vivo macroscopic counting of metastatic nodules. Most researchers do not even know when and where in the lung NK cells kill metastatic tumor cells. Live imaging is a powerful approach to address such questions.

      Reviewer #1 (Significance (Required)):

      There are several points to address to improve the significance of these data.

      **Major points**

      1) A global point: 3 mice/group is too small to analyse and interpret the data because of the heterogeneity of the mice. Mean +/- SEM has to be presented instead of SD.

      Response: For the sake of animal welfare, researchers are asked to use the minimal number of mice. Moreover, only one mouse can be observed in each imaging session, which takes several hours. In most experiments we performed two independent experiments with three mice each. We believe the number is appropriate for this type of experiment. In the case of a small number of samples, we think SD is better than SEM.

      2) The authors used the well-known polyclonal anti-asialoGM1 Ab to deplete NK cells. AsialoGM1 is also expressed by ILC1, T, NKT and gd+T cells but also basophils (Trambley J et al., Asialo GM1(+) CD8(+) T cells play a critical role in costimulation blockade-resistant allograft rejection. JCI, 1999). The authors checked the involvement only of basophils. They have to check the depletion of each of these populations specifically in the lung to assume that the depletion impacts only the NK cells, or they must change their conclusion throughout the manuscript and say that not only NK cells but maybe also ILC1, NKT and/or gd+T cells are responsible for and involved in the control of the disseminated tumor cells.

      Response: We obtained similar observations by using BALB/c nu/nu mice, which lack T cells. Therefore, we can exclude the contribution of T cells at least in the acute phase […]

      3) Lines 133 to 136 : The authors say that they « did not observe any significant difference in the relative increase of the bioluminescence signal between the control and αAGM1-treated mice, implying that NK cells eliminate disseminated melanoma cells primarily in the acute phase […]

      Response: After 24 hrs, the slope of increment of bioluminescence intensity (BLI) did not change significantly between αAGM1-treated mice and control mice. In both mice, the doubling times of melanoma cells are approximately one day.

      4) Fig S3A-B : The authors say that basophils express aGM1, so they assessed the involvement of basophils in the elimination of B16F10 tumor cells with the depleting aCD200R3 mAb. They also checked the involvement of neutrophils and monocytes. They observed that basophils, neutrophils and monocytes are not involved in B16F10 elimination. But what is the hypothesis behind assessing the role of neutrophils and monocytes? Moreover, they did not explore the role of basophils in the other models, including MC-38, BRAFV600E and 4T1 tumor cells.

      Response: We depleted neutrophils and monocytes because antibody-mediated removal of leukocytes could have non-specifically increased the survival of tumor cells. As for expanding the number of experiments with different cell lines, we are afraid that it would be too much of a burden, considering the period required for the experiments and animal welfare.

      5a) Fig 1D : Missing control : the author must add the WT Balbc + a-AGM1 as control.

      Response: We have this data, which will be included in the revised paper.

      5b) Lines 154 to 156 : the authors say that « T cell immunity does not contribute to tumor cell reduction » because tumor cells are eliminated in the nu/nu mice as efficiently as in the WT Balbc mice. This is not correct because they are looking at a window that corresponds to innate immunity activation (up to 24h), so they cannot talk about T cell immunity; the adaptive response comes later, around 8 days after.

      Response: Yes, we are focusing on the early phase of the rejection of metastatic tumor cells. We will rephrase the sentences.

      6) Line 159 : (refer to point #2) To affirm that NK cells are critical and involved in the elimination of the disseminated tumor, the authors have to perform experiments in a model of NK cell deficiency. The most relevant nowadays is the NKp46ICRExrosa26DTA mice, which are deficient in NK cells but also ILC1 cells. Indeed, the authors have used the NKp46iCre mouse model for other questions.

      Response: As the reviewer stated, the contribution of NK cells in the rejection of metastatic tumors is very well known. We do not think we need to repeat the experiments by using other genetically modified mouse lines, which will take at least one year. We wish to emphasize again that the new findings of our paper are in the in vivo imaging.

      7a) Fig 2F : isotype control (IC) missing

      Response: According to the reviewer's suggestion, we will perform control experiments with an isotype control.

      7b) Lines 181-182 : The authors conclude that the adhesion of NK cells to the pulmonary endothelial cells is mediated primarily by LFA-1. This is not totally true because it is only partially mediated, as observed in Fig 2F. So the authors should change their conclusion and specify that the adhesion is partially mediated by LFA-1.

      Response: We will rephrase the result section in the revised paper.

      8) Fig S5B-C-D and S7: The authors talk about tumor cell death. But they are analyzing Ca2+ influx in vitro, so it is a little bit different from cell death. I'm wondering how cell death is measured, especially in Fig S5D and S7?

      Response: Under microscopes, apoptosis can be easily recognized by the appearance of blebs. We will include videos in the revised paper.

      9) Fig 4H and lines 232-233 : the authors conclude that « damage to tumor cells is dependent on the engagement of DNAM-1 on NK cells ». No experiment was performed to support this point, so the authors cannot maintain this conclusion. First, the authors only analyzed Ca2+ influx at a specific time point. So this result only shows that Nectin-5 and/or Nectin-2 expressed by B16F10 is involved in the Ca2+ influx following NK cell contact, but there are no data on the DNAM-1 contribution. So, the role of NK cells, and specifically of DNAM-1+ NK cells, has not been addressed here. To answer that question, the authors have to use an in vivo model of engrafted WT vs Necl-5/2 KO B16F10 in a WT vs DNAM-1-deficient NK cell mouse model to ascertain the contribution of Necl-5/2-DNAM-1 on NK cells. Moreover, survival curves and bioluminescence experiments would be very much appreciated.

      Response: We have shown the data with Necl-5/Nectin-2-deficient B16F10 cells in Fig. S7. I understand the importance of the experiment with the DNAM-1-deficient mice. But the introduction of another knockout mouse line cannot be performed easily. Instead, we will tone down the conclusion on the requirement of signaling from Necl-5/Nectin-2 to DNAM-1.

      10) Lines 253-254 : the authors talk about tumor apoptosis but they are looking at Ca2+ influx. So, they should change their conclusion or show killing experiment.

      Response: In Figure S7, we have shown that the sustained Ca2+ influx is a useful surrogate marker for apoptosis. We will include this information explicitly in the revised paper.

      11) Fig 6 : the authors conclude that the thrombin-dependent shedding of Necl-5 causes evasion of NK cell surveillance. However, all experiments are correlations and do not involve Necl-5, DNAM-1+ NK cells and thrombin or anticoagulants in the same experiment. So, as in comment #9, to address this point, the authors should inject WT vs Necl-5-deficient B16F10-Akaluc into WT vs NK cell-depleted mice and monitor the bioluminescence of the tumor cells within 24h following injection of anticoagulants as in Fig 6H. Moreover, monitoring of the survival curve and the number of lung metastases would also be very important and informative to really answer this point.

      Response: We will try the requested experiments during revision.

      **Minor points**

      1) Fig 2E: The authors assess the involvement of LFA-1 and MAC-1 in the attachment of NK cells to the pulmonary endothelial cells. But there are other adhesion molecules that are known to be expressed by NK cells, for example CR4 (CD11c/CD18). So, the attachment of NK cells could also be due to this molecule.

      Response: We agree. The text will be modified to suggest the involvement of other adhesion molecules.

      2) Lines 190 to 197 : Authors should put this methodology part in the « material and method » in order to be more clear on the message they want to deliver.

      Response: We will modify the text according to the suggestion.

      3) line 228 : There is no hypothesis or explanation regarding the use of Necl5/Necl2-deficient B16F10. Why did the authors decide to explore this pathway? The authors could add a transition sentence and explanation to help readers.

      Response: We will refer to previous papers suggesting the role of DNAM-1 and its ligands, Necl-5 and nectin-2.

      4) The authors could perform the same experiment as in Fig S7D and assess ERK activation of DNAM+ vs DNAM- NK cells against WT vs Necl-5/Necl-2 KO R-GECO B16F10 cells.

      Response: We will try the suggested experiments.

      5) Line 283 : Please reformulate the sentence. Check the figures associated with the text.

      Response: We will correct this error. The figures will be Fig. 5E and 5F.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors use in vivo imaging techniques to investigate the killing of lung metastases by NK cells. They demonstrate that the cleavage of CD155 may result in resistance to killing by NK cells and suggest that this could be an immune evasion mechanism of metastatic tumor cells.

      Overall, the subject is highly relevant, and the in vivo imaging is an interesting and highly relevant technique. However, the message, that tumor cells escape the killing by NK cells by cleavage of CD155 is interesting, but not yet fully supported by the data.

      **Major comments:**

      • Figure 6: To support their main claim the authors would need to transfect the tumor cells with a CD155 mutant, which cannot be cleaved by Thrombin and show that these tumor cells can no longer escape NK cell-mediated killing. This experiment is straightforward and feasible. Another important experiment along this line would be to use the CD155/CD112 deficient tumor cells (which the authors use in figure 4) in the experiments shown in figure 1. One would expect that tumor control by NK cells within the first 24h is absent when using these tumor cells.

      Response: We previously made five CD155 mutants, which could be resistant to thrombin-mediated cleavage, and re-expressed them in CD155/CD112 deficient tumor cells. However, none of the mutants was killed by NK cells either in vivo or in vitro. It appears that the potential thrombin-cleavage site(s) reside in the recognition site by DNAM-1. We will include this observation in the discussion.

      • Figure 5: The demonstration that ERK is activated in this in vivo setting is novel. However, ERK activation is not DNAM-1 specific and the ERK inhibitor is significantly less effective than the depletion of NK cells. Therefore, the relevance of these data to the main message of the manuscript is unclear and the figure could be omitted.

      Response: We agree that the modest effect of MEKi implies that ERK activation is dispensable for NK activation. However, ERK activation is a useful marker of NK cell activation. The data shown here vividly show the timing of NK cell activation and following tumor cell killing. Because the in vivo dynamics of NK cell activation and tumor cell killing is the most important message of this work, we wish to show this data.

      • In general, the issue of NK cell exhaustion should be addressed in more detail. The experiments do not address serial killing activity of NK cells and more data is needed to show that it is not an exhaustion of NK cells but the cleavage of CD155 from the tumor cells that prevents further killing.

      Response: We believe, Fig. 5G clearly shows that NK cells are not exhausted 24 hours after tumor cell injection.

      **Minor comments:**

      • Figure 1C: The relevance of this experiment needs to be better explained.

      Response: We will rephrase the result section in the revised paper.

      • Figure 3A: What does SHG stand for?

      Response: It is shown in line 625, M&M section. We will state in the figure legend that SHG stands for second harmonic generation.

      • Figure 3: Please add a statistical analysis for these experiments.

      Response: We will include P values in the revised paper.

      • Figure 4: The use of the caspase-3 and the calcium sensors may detect different cytotoxic mechanisms used by the NK cells. While caspase-3 can be activated by death receptor and perforin/granzyme B mediated killing, the calcium sensor may report mostly on perforin mediated membrane damage. These killing mechanisms have different kinetics and are differentially used during serial killing by NK cells. This should be addressed (at least in the discussion).

      Response: We thank the reviewer for this invaluable comment. We will include this discussion.

      Reviewer #2 (Significance (Required)):

      Investigating the in vivo cytotoxicity of NK cells against tumor cells by using live imaging technologies is highly relevant for the understanding of the dynamic relationship between tumor and killer cells. Therefore, the subject of this manuscript and the technologies used are very relevant, as in vitro killing activities do not always translate to the in vivo setting.

      Response: We thank the reviewer for the favorable comment.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      Ichise et al. present a solid work describing the modality and time frame of action of NK control over seeding metastatic cells within the lung vasculature. The authors use a variety of techniques to dissect how NK cells patrol the lung vasculature, how they interact with cancer cells as they interact with the endothelial cells, and how they undergo ERK-dependent activation leading to calcium influx in cancer cells and to their death. The data support the notion that this NK control occurs over an early time frame, 4h after cancer cell arrival, and is mediated by Necl expression on cancer cells. After this time point, cancer cells show a thrombin-dependent loss of Necl expression on their surface and therefore become resistant to NK control.

      **Comments:**

      The data presented support the conclusions. This work utilizes a variety of elegant strategies, combining reporter strategies with in vivo imaging to assess the phenomena of interaction, ERK activation, calcium influx and apoptosis activation directly in the lung.

      In terms of experiments, I found the work thorough and complete.

      The data are presented well overall and the statistics seem adequate.

      I only have a few suggestions:

      Supplementary Figure S3 shows the use of anti-Ly6G to deplete neutrophils in the lungs of C57BL/6 mice injected with melanoma B16F10 cells. It was recently shown that this antibody is not efficient in depleting neutrophils in this background, but only leads neutrophils to internalise Ly6G so that they cannot be detected by FACS. As shown in Boivin et al. 2020 (http://doi.org/10.1038/s41467-020-16596-9), neutrophil depletion in C57BL/6 mice can be achieved by using anti-Gr1 antibody. Therefore, if the authors aim to show this additional control, which I also agree is really good to have, I suggest performing the experiment according to the best-known practice.

      Response: We will perform the suggested experiment.

      Figure 1E: in the text the experiment is described as 4T1 Akaluc cells being inoculated into the footpad of BALB/c mice with either control antibody or αAGM1, but the legend states that mice were subcutaneously injected with B16 Akaluc cells into the footpad.

      As B16 melanoma cells are not in the BALB/c background, I assume the legend needs to be corrected as the cells should be 4T1. However, I wonder if injecting 4T1 breast cancer cells in the footpad could have led to the substantial growth required for lung metastasis without impairing the animal's mobility. Could it be that cells were actually injected in the fat pad of the mice and this is just a misspelling in the text?

      In this case, the difference in tissue-resident NK cells could also potentially explain why 4T1 cells are not cleared in the fat pad like the B6 cells are in the footpad.

      The authors should comment on the difference in the clearance of the cells at the injection site in Figure 1C vs Figure 1E.

      Response: We apologize for the error in the legend.

      Figure 1C was performed to examine whether NK cells in the lung could be exhausted or inert 14 days after the inoculation of B16F10 cells. In this experiment, Akaluc-expressing B16F10 cells were inoculated to monitor the bioluminescence for 24 hrs.

      In figure 1E, we used Akaluc-expressing 4T1 breast cancer cells because 4T1 cells inoculated into the footpad can spontaneously metastasize to the lung (Kamioka et al., 2017). We observed the bioluminescence of 4T1 cells in the lung for up to 20 days.

      Ref: Kamioka, Y., Takakura, K., Sumiyama, K., and Matsuda, M. (2017). Intravital FRET imaging reveals osteopontin-mediated polymorphonuclear leukocyte activation by tumor cell emboli. Cancer Sci 108, 226-235.

      Reviewer #3 (Significance (Required)):

      The present work is highly relevant to the field of cancer metastasis. While it is known that NK cells are responsible for the first line of defence against metastatic seeding, most studies focus on how they are suppressed or influenced by other immune cells. The present study provides a very accurate description of their mechanism of action and of how they depend on the interaction with the endothelial cells, and highlights the novel role of thrombin in inducing cancer cell resistance to NK cells. What causes thrombin activation is the next relevant question, but in my opinion this study is complete and important.

      My field of expertise is cancer metastasis and its interaction with the immune system, and I personally enjoyed reading this work very much.

      Response: We thank the reviewer for favorable comments and appreciate the effort to evaluate our work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      Ichise et al. present a solid work describing the modality and time frame of action of NK control over seeding metastatic cells within the lung vasculature. The authors use a variety of techniques to dissect how NK cells patrol the lung vasculature, how they interact with cancer cells as they interact with the endothelial cells, and how they undergo ERK-dependent activation leading to calcium influx in cancer cells and to their death. The data support the notion that this NK control occurs over an early time frame, 4h after cancer cell arrival, and is mediated by Necl expression on cancer cells. After this time point, cancer cells show a thrombin-dependent loss of Necl expression on their surface and therefore become resistant to NK control.

      Comments:

      The data presented are supporting the conclusions. This work utilizes a variety of elegant strategy combining reporter strategy with in vivo imaging to assess the phenomenon of interaction, ERK activation, Calcium Inflax and Apoptosis activation directly in the lung. In term of experiments, I found the work thorough and complete. The data a presented well overall and the statistics seems adequate. I only have few suggestions:

      Supplementary Figure S3, show the use of antiLy6G to deplete neutrophils in the lungs of C57BL/6 mice injected with melanoma B16F10 cells. It was recently shown that this antibody is not efficient in depleting neutrophils in this background, but only lead neutrophils to internalise the Ly6G so they cannot be detected by FACS. As shown in Boivin et al 2020 http://doi.org/10.1038/s41467-020-16596-9) neutrophils depletion in C57BL/6 mice can be achieved by using antiGr1 antibody. Therefore, if the authors aim to show this additional control, which I also agree is really good to have, I suggest performing the experiment accordingly to the best-known practice.

      Figure 1E: in the text the experiment is described as 4T1 Akaluc cells inoculated into the foot pad of BALB/c mice with either control antibody or αAGM1, but the legend states that mice were subcutaneously injected with B16 Akaluc cells into the footpad. As B16 melanoma cells are not of BALB/c background, I assume the legend needs to be corrected and the cells should be 4T1. However, I wonder whether injecting 4T1 breast cancer cells into the footpad could have led to the substantial growth required for lung metastasis without impairing the animals' mobility. Could it be that the cells were actually injected into the fat pad of the mice and this is just a misspelling in the text? In that case, the difference in tissue-resident NK cells could also potentially explain why 4T1 cells are not cleared in the fat pad as the B16 cells are in the footpad.

      The authors should comment on the difference in clearance of the cells at the injection site in Figure 1C vs Figure 1E.

      Significance

      The present work is highly relevant to the field of cancer metastasis. While it is known that NK cells are responsible for the first line of defence against metastatic seeding, most studies focus on how they are suppressed or influenced by other immune cells. The present study provides a very accurate description of their mechanism of action and of how they depend on interaction with endothelial cells, and highlights the novel role of thrombin in inducing cancer cell resistance to NK cells. What causes thrombin activation is the next relevant question, but in my opinion this study is complete and important.

      My field of expertise is cancer metastasis and its interaction with the immune system, and I very much enjoyed reading this work.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors use in vivo imaging techniques to investigate the killing of lung metastases by NK cells. They demonstrate that the cleavage of CD155 may result in resistance to killing by NK cells and suggest that this could be an immune evasion mechanism of metastatic tumor cells. Overall, the subject is highly relevant, and in vivo imaging is an interesting and well-suited technique. However, the message that tumor cells escape killing by NK cells through cleavage of CD155, while interesting, is not yet fully supported by the data.

      Major comments:

      1. Figure 6: To support their main claim the authors would need to transfect the tumor cells with a CD155 mutant that cannot be cleaved by thrombin and show that these tumor cells can no longer escape NK cell-mediated killing. This experiment is straightforward and feasible. Another important experiment along this line would be to use the CD155/CD112-deficient tumor cells (which the authors use in Figure 4) in the experiments shown in Figure 1. One would expect tumor control by NK cells within the first 24h to be absent when using these tumor cells.
      2. Figure 5: The demonstration that ERK is activated in this in vivo setting is novel. However, ERK activation is not DNAM-1 specific and the ERK inhibitor is significantly less effective than the depletion of NK cells. Therefore, the relevance of these data to the main message of the manuscript is unclear and the figure could be omitted.
      3. In general, the issue of NK cell exhaustion should be addressed in more detail. The experiments do not address serial killing activity of NK cells and more data is needed to show that it is not an exhaustion of NK cells but the cleavage of CD155 from the tumor cells that prevents further killing.

      Minor comments:

      1. Figure 1C: The relevance of this experiment needs to be better explained.
      2. Figure 3A: What does SHG stand for?
      3. Figure 3: Please add a statistical analysis for these experiments.
      4. Figure 4: The use of the caspase-3 and the calcium sensors may detect different cytotoxic mechanisms used by the NK cells. While caspase-3 can be activated by death receptor and perforin/granzyme B mediated killing, the calcium sensor may report mostly on perforin mediated membrane damage. These killing mechanisms have different kinetics and are differentially used during serial killing by NK cells. This should be addressed (at least in the discussion).

      Significance

      Investigating the in vivo cytotoxicity of NK cells against tumor cells by using live imaging technologies is highly relevant for understanding the dynamic relationship between tumor and killer cells. Therefore, the subject of this manuscript and the technologies used are very relevant, as in vitro killing activities do not always translate to the in vivo setting.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors aimed to understand the control and elimination of disseminated tumor cells by NK cells within the lung, their main question being how pulmonary NK cells are able to prevent tumor cells from colonizing the lung.

      To dissect this question, Hiroshi Ichise and colleagues took advantage of an ultra-sensitive bioluminescence whole-body imaging system combined with intravital two-photon microscopy of tumor or NK cells expressing genetically encoded biosensors to explore the behavior and functional competence of NK cells in an experimental lung metastasis model. First, the authors monitored the fate of intravenously injected B16-Akaluc cells from 5 min to 10 days and observed that tumor cells decrease rapidly within the first 12-24 hours. In parallel, they depleted asialoGM1+ and NK1.1+ cells by injecting depleting anti-aGM1 and anti-NK1.1 antibodies in order to assess the involvement of these populations in the elimination of disseminated tumor cells. They conclude that the rapid decrease in tumor cells is mediated by NK cells. Consistent with these first data, the authors also observe the same early NK cell-mediated impact on two other syngeneic mouse tumor cell lines: the BRAFV600E melanoma and the colon adenocarcinoma MC-38.

      In a second part, the authors dissected NK cell dynamic behavior in the pulmonary capillaries by taking advantage of NKp46iCRExrosa26dtTomato mice, in which NKp46+ cells are fluorescent, and performed 2P intravital imaging to follow NKp46+ cell behavior in situ. They could nicely observe that NK cells arrive from the capillaries and patrol on the lung epithelial cells in a stall-crawl-jump manner. Moreover, they also show that attachment to the pulmonary capillaries is mediated by LFA-1. In the B16F10 tumor model, they observe that NK cells stay longer in the capillaries and crawl for longer, indicating that NK cells stay in contact with tumor cells for longer.

      The authors then explored NK-mediated tumor killing in the lung by measuring tumor cell apoptosis using B16F10-SCAT3 cells (which allow visualization of caspase-3 activation) and Ca2+ influx in tumor cells expressing two Ca2+ sensors, GCaMP6s and R-GECO. They could observe caspase-3 activation as well as Ca2+ influx in tumor cells within a few minutes after encountering NK cells. They also observe that evasion of NK cell surveillance is mediated by Nectin-5 and Nectin-2 expressed on tumor cells.

      Then, they focus on NK cell activation by looking at ERK activation. To do so, they isolated NK cells from Tg mice expressing a FRET-based ERK biosensor and performed in vitro killing assays against B16-R-GECO tumor cells as well as in vivo experiments. For the in vivo experiments, they developed reporter mice whose NK cells express the FRET biosensor for ERK. They observe that ERK-dependent NK cell activation contributes to the elimination of disseminated tumor cells within the first few hours but not after 24 hours. Indeed, they observe that B16F10-Akaluc tumor cells are equally eliminated when injected 24h after a first injection of B16F10 or PBS in mice. The authors conclude that tumor cells acquire the capacity to evade NK cell surveillance after 24h, rather than NK cells losing tumoricidal activity over time. Finally, the authors explore the potential tumor cell evasion of NK cell surveillance. They show that this evasion is mediated by the shedding of cell surface Necl-5. They next show that cleavage of the extracellular domain of Necl-5 is mediated by thrombin in vitro and that anticoagulants such as Warfarin, Edoxaban or Dabigatran Etexilate promote tumor elimination, as observed in the bioluminescence experiments. This loss prevents the NK cell signaling needed for effective killing of tumor targets. However, most of the results remain correlations and have not been formally demonstrated, or lack controls. B16F10 is a well-known and well-characterized NK cell target in vivo, so the first part is not really new except for the in situ behavior of NK cells within the lung capillaries. The new mechanism of thrombin-mediated shedding of Necl-5 causing evasion from NK cell surveillance is concentrated in the last figure (Fig. 6), and some supplemental experiments are needed to confirm this claim.

      Significance

      There are several points to address to improve the significance of these data.

      Major points

      1) A global point: 3 mice/group is too small to analyse and interpret the data because of the heterogeneity of the mice. Mean +/- SEM has to be represented instead of SD.

      2) The authors used the well-known polyclonal anti-asialoGM1 Ab to deplete NK cells. AsialoGM1 is also expressed by ILC1, T, NKT and γδ T cells, as well as basophils (Trambley J et al., Asialo GM1(+) CD8(+) T cells play a critical role in costimulation blockade-resistant allograft rejection. JCI, 1999). The authors checked the involvement of basophils only. They have to check the depletion of each of these populations specifically in the lung in order to claim that the depletion impacts only NK cells, or they must change their conclusion throughout the manuscript and state that not only NK cells but possibly also ILC1, NKT and/or γδ T cells are involved in the control of disseminated tumor cells.

      3) Lines 133 to 136: The authors say that they "did not observe any significant difference in the relative increase of the bioluminescence signal between the control and αAGM1-treated mice, implying that NK cells eliminate disseminated melanoma cells primarily in the acute phase (< 24hrs) of lung metastasis". Please comment, because the depletion of asGM1+ cells also impacts the growth of the tumor up to 8 days (Fig 1B-E-G).

      4) Fig S3A-B: The authors say that basophils express aGM1, so they tested the involvement of basophils in the elimination of B16F10 tumor cells with a depleting aCD200R3 mAb. They also checked the involvement of neutrophils and monocytes. They observed that basophils, neutrophils and monocytes are not involved in B16F10 elimination. But what is the hypothesis behind assessing the role of neutrophils and monocytes? Moreover, they did not explore the role of basophils in the other models, including MC-38, BRAFV600E and 4T1 tumor cells.

      5a) Fig 1D: Missing control: the authors must add WT BALB/c + αAGM1 as a control.

      5b) Lines 154 to 156: the authors say that "T cell immunity does not contribute to tumor cell reduction" because tumor cells are eliminated in nu/nu mice as efficiently as in WT BALB/c mice. This is not correct, because they are looking at a window that corresponds to innate immune activation (up to 24h), so they cannot draw conclusions about T cell immunity; the adaptive response comes later, around 8 days after.

      6) Line 159: (refer to point #2) To affirm that NK cells are critical and involved in the elimination of the disseminated tumor, the authors have to perform the experiment in a model of NK cell deficiency. The most relevant one nowadays is the NKp46ICRExrosa26DTA mouse, which is deficient in NK cells but also in ILC1 cells. Indeed, the authors have used the NKp46iCre mouse model for other questions.

      7a) Fig 2F: IC missing

      7b) Lines 181-182: The authors conclude that NK cell adhesion to the pulmonary endothelial cells is mediated primarily by LFA-1. This is not entirely true, because it is only partially mediated by LFA-1, as observed in Fig 2F. The authors should therefore change their conclusion and specify that adhesion is partially mediated by LFA-1.

      8) Fig S5B-C-D and S7: The authors talk about tumor cell death. But they are analyzing Ca2+ influx in vitro, which is somewhat different from cell death. I am wondering how cell death is measured, especially in Fig S5D and S7?

      9) Fig 4H and lines 232-233: the authors conclude that "damage to tumor cells is dependent on the engagement of DNAM-1 on NK cells". No experiment was performed to support this point, so the authors cannot maintain this conclusion. First, the authors only analyzed Ca2+ influx at a specific time point. This result therefore only shows that Nectin-5 and/or Nectin-2 expressed by B16F10 is involved in the Ca2+ influx following NK cell contact, but there are no data on the DNAM-1 contribution. So the role of NK cells, and specifically of DNAM-1+ NK cells, has not been addressed here. To answer that question, the authors have to use an in vivo model of engrafted WT vs Necl-5/2 KO B16F10 in WT vs DNAM-1-deficient NK cell mouse models to ascertain the contribution of Necl-5/2-DNAM-1 on NK cells. Moreover, survival curves and bioluminescence experiments would be much appreciated.

      10) Lines 253-254: the authors talk about tumor apoptosis but they are looking at Ca2+ influx. So they should change their conclusion or show a killing experiment.

      11) Fig 6: the authors conclude that the thrombin-dependent shedding of Necl-5 causes evasion of NK cell surveillance. However, all experiments are correlations and do not involve Necl-5, DNAM-1+ NK cells and thrombin or anti-coagulation factors in the same experiment. So, as in comment #9, to address this point, the authors should inject WT vs Necl-5-deficient B16F10-Akaluc into WT vs NK cell-depleted mice and monitor the bioluminescence of the tumor cells within 24h following injection of anti-coagulation factors, as in Fig 6H. Moreover, monitoring the survival curve and the number of lung metastases would also be very important and informative to really answer this point.

      Minor points

      1) Fig 2E: The authors assess the involvement of LFA-1 and MAC-1 in NK cell attachment to the pulmonary endothelial cells. But there are other adhesion molecules known to be expressed by NK cells, for example CR4 (CD11c/CD18). So the attachment of NK cells could also be due to this molecule.

      2) Lines 190 to 197: The authors should move this methodology part to the "Materials and Methods" section in order to make the message they want to deliver clearer.

      3) Line 228: There is no hypothesis or explanation regarding the use of Necl5/Necl2-deficient B16F10. Why did the authors decide to explore this pathway? The authors could add a transition sentence and explanation to help readers.

      4) The authors could perform the same experiment as in Fig S7D and assess ERK activation of DNAM+ vs DNAM- NK cells against WT vs Necl-5/Necl-2 KO R-GECO B16F10 cells.

      5) Line 283: Please reformulate the sentence. Check the figures associated with the text.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their insightful comments and suggestions. Addressing them will improve our work. Please find below our point-by-point answers to the issues raised. We also provide a partially revised version of the manuscript with changes indicated in blue.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary**

      The authors propose a mechanism through which voltage-dependent water pore formation is key to the internalization of cell-permeable peptides (CPPs). The claim is based on an in silico study and on several experimental approaches. The authors compare 5 peptides (R9, TAT-48-57, Penetratin, MAP and Transportan) and use 3 distinct cell lines (Raji, SKW6.4 and HeLa cells), plus neurons in primary cultures. They also present in vivo experiments (mouse skin and zebrafish embryo). All in all, it is an interesting study, but it raises several issues that need to be addressed. Moreover, the length and structure of the manuscript make it very difficult to read (see below under "Reviewer statement").

      **Reviewer statement**

      The instructions are to use the "Major comments" section to answer 6 precise questions. Unfortunately, this is not possible due to the structure of the document to review. The main manuscript (22 pages) comes with 4 primary figures and 19 supplemental ones. Most of these figures have an enormous number of panels and their legends occupy 17 pages. To this, are added 6 supplemental tables and 7 supplemental movies (with 2 pages of legends), 28 pages of Material and Methods, and 146 References (109 for the main manuscript and 37 for Supplemental information). To be frank, I was often tempted to send the manuscript back, asking for the authors to submit a document facilitating the task of the reviewers.

      Because of this complexity, my "Major comments" will come after a page by page, paragraph (§s) by paragraph and figure by figure "Detailed analysis" of the manuscript.

      **Detailed analysis**

      Q1. Page 4 § 3

      The test is based on the ability of TAT-RasGAP to kill the cells. Although controls exist, this is worrying since necrotic death might participate in the rupture of the membrane and artificially amplify internalization after a first physiological entry of the peptide. It is also a bit dangerous to add a FITC group to a short peptide without controlling that it has no effect on the interaction with the membrane (FITC-induced local hydrophobicity can provoke peptide tilting and membrane shearing). In the same vein, the very high peptide concentrations often used in the study (40µM for Raji and SKW6.4 cells and 80µM on HeLa cells) can be highly toxic.

      A1. We took advantage of the fact that TAT-RasGAP317-326 can kill cells to design a CRISPR/Cas9 screen based on cell survival for the identification of genes encoding proteins involved in CPP uptake. For this purpose, it was important therefore that the peptide was able to kill wild-type cells. Even if we consider the possibility that “necrotic death might participate in the rupture of the membrane and artificially amplify internalization after a first physiological entry of the peptide”, it remains that the cells that survived the screen did so because they were carrying mutations in genes that encoded potassium channels required for CPP uptake. And since the cells that survived the screen, by definition, were not dying, the issue raised by the reviewer is void in this case. The reviewer mentions that we included controls to validate the observations made with FITC-TAT-RasGAP317-326. Indeed, these controls were performed to address the potential problem raised by the reviewer. These controls, listed below, demonstrate that the genes identified through the CRISPR/Cas9 screen were also involved in the uptake of CPPs devoid of killing properties as well as CPPs that were not labelled with fluorophores.

      i) Three different cell lines, lacking specific potassium channels identified through the CRISPR/Cas9 screen, were unable to allow a non-labelled, non-toxic CPP (TAT-PNA) to enter cells (Supplementary Fig. 8a).

      ii) The Cre recombinase hooked to TAT, a construct that is not labelled with a fluorochrome and that is not toxic, did not enter Raji cells lacking the KCNQ5 potassium channels, also identified through a CRISPR/Cas9 screen (Supplementary Fig. 8b).

      iii) The internalization of a TAT-conjugated FITC-labelled cell-protective therapeutic compound was inhibited, sometimes fully, in three different cell lines, lacking specific potassium channels identified through the CRISPR/Cas9 screen (Supplementary Fig. 8c).

      Additionally, we are now reporting that the entry of FITC-labelled TAT, R9, and penetratin, all non-toxic CPPs, is impaired in Raji cells lacking the KCNQ5 potassium channel identified in the CRISPR/Cas9 screen. These new results will be incorporated in the revised version of our manuscript.

      Supporting evidence that a potential toxicity effect of TAT-RasGAP317-326 is not a confounding factor in experiments recording the initial uptake of the peptide is that internalization is measured after one hour of incubation with the cells (Figure 1), a time at which the peptide only minimally impacts the survival of cells (PNAS December 15, 2020 117 31871-31881).

      Finally, please note that depolarizing cells, which is what happens in cells lacking the potassium channels identified through the CRISPR/Cas9 screen, not only blocked the uptake of TAT-RasGAP317-326, but also the uptake of a series of non-toxic CPPs (using short-time incubation protocols; Figure 2).

      Page 5 § 1

      Q2. Supp. Fig.1a shows no differences between the 3 cell types, even though they differ in their modes of peptide internalization, some favoring vesicular staining and others cytoplasmic diffusion.

      A2. The images shown in panel A of this figure depict, for each cell line, examples of cells that do not take up the CPP, those that only display vesicular staining, and those that additionally take up the peptides in their cytosol. These images were picked to depict these uptake phenotypes and this is why they are similar in the three cell lines. Panel A does not provide any quantitative information on the prevalence of these different uptake modes in the three cell lines. This is shown in panel B of Supplementary Fig. 1. There is, therefore, no discrepancy between the two panels.

      Q3. Multiplying cell and peptide types contributes to the complexity of the manuscript without increasing its interest. If there is a conceptual breakthrough, as might be the case, it is obscured by the accumulation of useless images and data. A step into simplifying the manuscript would be (i), to concentrate on Raji cells (leaving out SKW6.4 and HeLa cells) and (ii) to only discuss the R9, TAT (including TAT-RasGAP) and Penetratin peptides.

      A3. We are sorry that the inclusion of several cell lines and several CPPs was seen as confusing by the reviewer. Our current view is that our observations are strengthened if we show that the observed effects are seen in several cell lines and with a variety of CPPs. We would therefore prefer not to exclude supportive evidence presented in our work, because if we remove some of the data shown in the manuscript, we will definitely weaken some of our claims. We nevertheless remain open on this point, which can be further discussed with the editors.

      Q4. TAT and R9 are poly-R peptides, which is not the case for Penetratin that has only 3 Rs. These 3 Rs are important (cannot be replaced by 3 Ks), but the two Ws absent in R9 and TAT are equally important as they cannot be replaced by Fs. This must be considered by the authors when they tend to generalize their model.

      A4. The point raised by the reviewer concerning the importance of W and R residues in CPPs is well taken. We have now developed this in the discussion with the addition of the paragraphs shown below.

      An additional potential explanation for the internalization differences observed between arginine- and lysine-rich peptides is that even though both arginine and lysine are basic amino acids, they differ in their ability to form hydrogen bonds, the guanidinium group of arginine being able to form two hydrogen bonds [1] while the lysyl group of lysine can only form one. Compared to lysine, arginine would therefore form more stable electrostatic interactions with the plasma membrane.

      Cationic residues are not the only determinant in CPP direct translocation. The presence of tryptophan residues also plays important roles in the ability of CPPs to cross cellular membranes. This can be inferred from the observation that Penetratin, despite only bearing 3 arginine residues, penetrates cells with similar or even greater propensities compared to R9 or TAT, which contain 9 and 8 arginine residues, respectively (Supplementary Fig. 9g). The aromatic character of tryptophan is not sufficient to explain how it favors direct translocation, as replacing tryptophan residues with the aromatic amino acid phenylalanine decreases the translocation potency of the RW9 (RRWWRRWRR) CPP [2]. Rather, differences in the direct translocation promoting activities of tryptophan and phenylalanine residues may come from the higher lipid bilayer insertion capability of tryptophan compared to phenylalanine [3-5]. There is a certain degree of interchangeability between arginine and tryptophan residues, as demonstrated by the fact that replacing up to 4 arginine residues with tryptophan amino acids in the R9 CPP preserves its ability to enter cells [6]. It appears that loss of positive charges that contribute to water pore formation can be compensated by acquisition of strengthened lipid interactions when arginine residues are replaced with tryptophan residues. This can explain why a limited number of arginine/tryptophan substitutions does not compromise CPP translocation through membranes.

      Q5. Supp. Fig1c-d is not necessary (very little information in it) and Supp. Fig 1e is misleading as it takes a lot of imagination to see a difference between homogenous (top) and focal (bottom) diffusion.

      A5. Since we perform cytosolic quantitation to infer direct translocation, it appears important to us, for allowing others to potentially replicate our results, that we precisely report how methodologically we perform our experiments. For Supplementary Fig. 1e, we agree that the examples shown are not easily interpretable. We have now removed this panel, as well as the accompanying panel f, from the Supplementary Fig. 1.

      Q6. Supp. Fig.1g: How many cells are we looking at? Given the high variance, the result cannot be interpreted easily. A distribution according to fluorescence bits would be a better way to present the data.

      A6. Over 230 cells have been quantitated per condition, which includes all cells where CPP entry has occurred regardless of the intensity or the type of entry. We did not only focus on cells with strong cytosolic staining to avoid any bias with regards to detection limitations. High variance can also be explained by the fact that CPP cellular entry is not synchronized. We tested the way of showing the data as suggested by the reviewer but this did not improve the visualization of the results in our opinion. We will therefore keep the initial presentation. Note that regardless of the way the data are presented, the conclusion remains the same, namely that illumination in our hands is not the cause of CPP membrane translocation.

      Q7. Supp. Fig2i. This panel confirms that Raji cells differ from the two other cell types by showing clear temperature dependency. The explanation will come later with the energy barrier for low Vm-induced pore formation. This contradicts earlier reports showing that Penetratin translocation is not temperature-dependent, possibly because it was done on neurons naturally hyperpolarized. Or else because mechanisms are, at least in part, different from the one proposed here for R9 and TAT. This requires some clarification and supports the suggestion that, instead of multiplying models and peptides, it would be more efficient to compare TAT, R9 and Penetratin internalization by Raji cells and primary neurons.

      A7. Supplementary Fig. 1i (not Supplementary Fig. 2i as indicated by the reviewer) was reporting the overall CPP uptake, both through direct translocation and endocytosis, as a function of temperature. As there is limited endocytosis in Raji cells, the data shown for this cell type mostly correspond to direct translocation. For HeLa and SKW6.4 cells, however, endocytosis is not marginal, and we will perform a new set of experiments to define the role of temperature (4, 20, 24, 28, 32°C) in CPP direct translocation (i.e. cytosolic acquisition) in HeLa and SKW6.4 cells (using the CPPs listed by the reviewer). We have partially performed this for HeLa cells already and this shows that direct translocation is indeed inhibited by low temperatures (more than 10-fold at 4°C compared to 37°C). Bear in mind that no endosomal escape occurs in our settings (see Supplementary Fig. 7c). This indicates that the decrease in cytoplasmic fluorescence induced by low temperature is not a consequence of diminished CPP endocytosis.

      Q8. Supp. Fig. 2a-f. Last sentence of the legend "Concentrations above 40µM led to too extensive cell death preventing analysis of peptide internalization". This confirms the warning against the use of concentrations varying between 40 µM and 80 µM and partially jeopardizes the validity of some experiments.

      A8. The reviewer has truncated this sentence that actually reads “Note: concentrations above 40 µM of TAT-RasGAP317-326 led to too extensive Raji and SKW6.4 cell death, preventing analysis of peptide internalization at these concentrations.” As different cell lines display various sensitivities to potential toxic effects induced by CPPs (Raji and SKW6.4 cells being more sensitive than HeLa cells for example), we have adapted the concentrations of CPPs used to monitor cellular uptake so that cell death was minimal or non-existent in order to prevent the potential confounding effects mentioned by the reviewer. Hence, in contrast to what the reviewer is stating, we are taking care of the toxicity effect and perform our experiments in conditions where toxicity is minimal. The logic of the reviewer to state that we “jeopardize[d] the validity of some experiments” is therefore unclear to us, as we did take care not to expose our cells to toxic CPP concentrations.

      Page 6 § 2

      Q9. The authors advocate 2 modes of entry, opposing transport across the membrane and endocytosis. In contrast with R9, TAT and Penetratin, Transportan or MAP seem to be purely endocytosed but, if they reach the cytoplasm, they still have to cross a membrane (unless "a miracle happens"). For Penetratin and R9/TAT, the authors consider that water pore and inverted micelle formation are incompatible. This is a bit hasty, as inverted micelles might induce water pores through W/lipid interactions requiring fewer R residues and, possibly, less energy. This provides the opportunity to signal that, in spite of their very high number, key references are missing or hidden in cited reviews, some of them written by colleagues who are not among the main contributors to the CPP field.

      A9. Transportan in our hands indeed appears to enter cells mostly via endocytosis. As reported by the reviewer, how Transportan reaches the cytosol remains unresolved.

      Our data support a model where CPPs enter cells via water pores that are not made by the CPPs themselves but that are created by the megapolarization state of the membrane. Our data therefore do not support toroidal or barrel-stave pore models because these pores would be built as a result of CPP assemblage.

      Inverted micelles have been hypothesized to mediate CPP translocation across membranes7 but, to our knowledge, there is no in silico or cellular experimental evidence for this in the literature. To us, the data on which the involvement of inverted micelles in CPP translocation is based are also fully consistent with the water pore model. CPP translocation through water pores has been seen by several authors during in silico experiments but, to the best of our knowledge, simulations have not reported the formation of inverted micelles during CPP translocation across membranes.

      Finally, we would be grateful to this reviewer if the “key references” that are apparently missing from our manuscript are disclosed so that we could acknowledge them appropriately.

      Page 7 § 1

      Q10. Fig. 1b confirms that Raji cells provide a good model for loss and gain of function (lovely rescue experiment) and that the authors should drop the two other cell types that provide no decisive information.

      A10. Raji and HeLa cells display a stronger direct CPP uptake impairment phenotype when lacking a given potassium channel (KCNQ5 and KCNN4, respectively). In these cell lines, it appears that one potassium channel predominantly controls the plasma membrane potential. In contrast, in SKW6.4 cells, several potassium channels (e.g. KCNN4 and KCNK5) appear to be equally or redundantly involved in the control of the membrane potential. This probably explains the intermediate impact on the Vm and on CPP direct translocation when knocking out a given potassium channel in this cell line. When cellular depolarization is induced pharmacologically, however, a clear impairment in CPP translocation is observed in this cell line. Thus, even though the Vm in SKW6.4 cells is controlled by several potassium channels, an appropriate membrane potential remains crucially required for these cells to take up CPPs across their membrane. We agree with the reviewer that the stronger phenotypic effect observed in Raji and HeLa cells allows easy interpretation. On the other hand, it seems important to us that we provide data reporting intermediate situations so that readers can appreciate the variability that can be observed in different cell lines. Nevertheless, in line with the reviewer’s suggestion, we would like to propose moving the SKW6.4 data from figure 1 to the supplemental data. Feedback from the editors would also be appreciated in this particular instance.

      Page 8 § 1

      Q11. A) Supp. Fig. 6b (no serum conditions) allows for the use of "normal" CPP concentrations and suggests that a fraction of the peptides may bind to serum components. No arrows in Supp. Fig. 6b (but in 6c), and the R/pyrene butyrate interaction is not in 6c but in 6a. Still for Supp. Fig. 6c, the death of cells at 20 µM (or less) even in the absence of K+ channels, confirms that we are borderline in terms of peptide toxicity.

      B) There is a confusion between Supp. Fig. 6d and 6e and a legend problem (6e is not described). Cell death is assessed in % of PI-positive cells. Does this securely distinguish between death and holes allowing for PI entry without death?

      C) The CPP is incubated in the presence of Pyrene butyrate, making the KO cells less resistant. How does that demonstrate that the potassium channels are not involved in the killing if the peptide is already in? Unless the KO is done after internalization (but the cells should be already dead or dying?). This lacks clarity.

      A11. We apologize for the lack of clarity in the legend of Supplementary Fig. 6. This will be corrected in the revised version of the manuscript.

      A) Supp. Fig. 6b (no serum conditions) allows for the use of "normal" CPP concentrations and suggests that a fraction of the peptides may bind to serum components.

      A) The reviewer is correct that CPPs interact with serum components. This is indeed what is reported in this figure. The presence or absence of serum has therefore an important impact in experiments performed with CPPs and should be reported to allow proper interpretation of our data.

      No arrows in Supp. Fig.6b (but in 6c), and the R/pyrene butyrate interaction is not in 6c but in 6a.

      Thank you for noting this. This is now corrected.

      Still for Supp. Fig. 6c, the death of cells at 20 µM (or less) even in the absence of K+ channels, confirms that we are borderline in terms of peptide toxicity.

      It has to be understood that in Supplementary Fig. 6c, we use the TAT‑RasGAP317‑326 peptide that is inducing cell death when translocating into cells8. This cell death response is not provided by the CPP portion of TAT‑RasGAP317‑326 (i.e. TAT) but by its bioactive cargo (i.e. RasGAP317‑326). The read-out in this particular experiment is therefore cell death and this should not be confused with general CPP toxicity.

      B) There is a confusion between Supp. Fig. 6d and 6e and a legend problem (6e is not described).

      B) This has now been fixed.

      Cell death is assessed in % of PI-positive cells. Does this securely distinguish between death and holes allowing for PI entry without death?

      The answer to this question is yes. In this manuscript we used PI in two very different experimental set-ups.

      i) the conventional cell death detection assay, where cells are incubated with 8 µg/ml PI prior to flow cytometry. In this set-up, dead cells with compromised membrane integrity have their nucleus brightly stained with PI.

      ii) the detection of small pores in the plasma membrane (water pores), where cells are incubated with ~30 µg/ml PI and the fluorescence of PI is measured in the cytosol by confocal microscopy. In this set-up, PI enters the cytosol through small plasma membrane pores but does not stain the DNA in the nucleus. This protocol has been previously described9 and we have further validated it in the present work (Figure 3 and Supplementary Fig. 12).

      PI does not fluoresce well unless it binds to DNA. In solution without cells, PI cannot be detected below 128 µg/ml (Supplementary Fig. 12e). At low PI concentrations (8 µg/ml), living cells (even when treated with compounds such as CPPs that create transitory pores) do not display cytosolic PI fluorescence. At high PI concentrations (32 µg/ml), the cytosol of CPP-treated cells becomes PI fluorescent. PI is positively charged and is attracted by the negative membrane potential of the cells. Its movement across the cell membrane is therefore unidirectional. This enables the PI molecules to accumulate/concentrate within the cytosol to values (> 64 µg/ml) allowing their detection (Supplementary Fig. 12a-c). PI and CPPs do not interact (Supplementary Figure 12d); hence they move independently of one another. If PI enters through the water pores induced by CPPs, the entry kinetics of PI and CPPs should be identical. Indeed, this is what we now show in a new figure (refer to our answer #31).
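      The accumulation argument above can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from our manuscript: it assumes, purely for illustration, that PI behaves as a freely permeant divalent cation (z = +2) that reaches Nernst equilibrium across the plasma membrane at 37 °C. The function name and the Vm values in the loop are hypothetical choices, and real cells are unlikely to reach this idealized equilibrium.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_accumulation_ratio(vm_mv, z=2, temp_c=37.0):
    """Idealized equilibrium ratio [ion]_cytosol / [ion]_medium.

    Assumes a freely permeant ion of valence z at Nernst equilibrium;
    z = 2 for PI is an illustrative assumption, not a measured value.
    """
    t_kelvin = temp_c + 273.15
    vm_volts = vm_mv / 1000.0
    # Nernst relation: a negative Vm concentrates cations inside the cell.
    return math.exp(-z * F * vm_volts / (R * t_kelvin))


if __name__ == "__main__":
    # Illustrative membrane potentials only (mV).
    for vm in (0, -25, -60, -150):
        ratio = nernst_accumulation_ratio(vm)
        print(f"Vm = {vm:5d} mV -> cytosol/medium ratio ~ {ratio:.3g}")
```

      Under these assumptions the ratio grows exponentially as the membrane potential becomes more negative, which is the qualitative point made above: a cationic dye present at a sub-detection concentration in the medium can become detectable once concentrated in the cytosol.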

      C) The CPP is incubated in the presence of Pyrene butyrate, making the KO cells less resistant. How does that demonstrate that the potassium channels are not involved in the killing if the peptide is already in? Unless the KO is done after internalization (but the cells should be already dead or dying?). This lacks clarity.

      C) For the pyrene butyrate experiments, the rationale was the following. The CRISPR/Cas9-identified potassium channels could either be involved in CPP internalization or be required for the killing activity of TAT-RasGAP317-326 once the peptide is already in the cytosol. To experimentally introduce TAT-RasGAP317-326 into the cytosol and to bypass any potential entry step depending on potassium channels, we used pyrene butyrate, which efficiently creates an artificial entry route for CPPs into cells. Our data show that when TAT-RasGAP317-326 is introduced into the cytosol through the use of pyrene butyrate, cells die whether or not they lack specific potassium channels. This led to our interpretation that potassium channels do not modulate the cell death activity of TAT-RasGAP317-326 once it is in the cytosol but that they are required for the entry of the CPP into the cytosol.

      Page 9 § 1

      Q12. The conclusion that the diffuse staining does not come from endosomal escape is based on the certainty that LLOME disrupts both endosomes and lysosomes. First, it should be verified with specific markers (rab5, rab7) that the fluorescent vesicles are endosomes. Second, the literature strongly suggests that LLOME primarily disrupts lysosomes and not endosomes. Finally, even if some endosomes are disrupted, the endosomal population is heterogenous and some CPPs may be in a subpopulation insensitive to LLOME. In addition, the importance of this issue is not well explained. In practice, access to the cytoplasm and nucleus requires crossing the plasma and/or the endosomal membrane and the latter, at least in early endosomes (thus the need of identifying the CPP-enriched vesicles), might not be very different from the plasma membrane.

      A12. The conclusion that diffuse staining does not come from endosomal escape is based on experiments where HeLa cells were incubated in the presence of CPP for 30 minutes to allow CPP entry into cells, then the cells were washed to prevent further uptake (Supplementary Fig. 7c). We only monitored the cells that initially took up the CPP by endocytosis and not through direct translocation (for the HeLa cell line, there is always a substantial fraction of such cells; see Supplementary Fig. 1b). We measured the cytosolic CPP fluorescence intensity in these cells by time-lapse confocal microscopy for 4 ½ hours. The procedure to do this is now explained in new Supplementary Fig. 7c. We then assessed the CPP fluorescence intensity within the cytosol. No increase in cytosolic fluorescence was detected in this condition, speaking against the possibility that cytosolic acquisition of CPPs by the cells resulted from vesicular escape (the identity of the vesicles being unimportant in this context). Our set-up has the potential to detect CPPs in the cytosol if these CPPs leak out from vesicles because we could measure increased CPP fluorescence in the cytosol in cells treated with LLOME. It did not matter in this positive control experiment what types of CPP-containing vesicles are disrupted by LLOME. What was important to show in this control condition was that the disruption of at least some CPP-containing vesicles permitted us to detect a cytosolic signal.

      Page 9 § 2

      Q13. Is Supp. Fig. 7e really necessary? First, as mentioned several times, if 20 µM is a borderline concentration in terms of toxicity, raising the concentration up to 100 µM is problematic. Secondly, what matters is not "binding" in general, but binding to the proper membrane components. As mentioned by the authors themselves (Supp. Fig. 1e and movie), there are privileged sites of entry that may correspond to the recognition of specific molecular entities/structures.

      A13. The goal of the experiments presented in Supplementary Figure 7e was to determine whether the CRISPR/Cas9-identified potassium channels modulate CPP/membrane interaction. If those channels were to be required for the initial binding of the CPPs to the plasma membrane, this would have not hampered cells to take up the CPPs. Our data showed (Figure 7e) that Raji cells lacking the KCNQ5 potassium channel had a slightly decreased ability to bind TAT-RasGAP317-326 but importantly, these cells, at similar or even higher initial surface binding compared to wild-type cells (this was achieved by adequately varying the CPP concentrations), were still drastically impaired in taking up the peptide. Note that after one hour of incubation with TAT-RasGAP317-326 in the presence of serum there is only marginal amount of cell death (317-326, we have now performed an additional experiment with TAT that is not toxic to cells that confirms our data obtained with TAT-RasGAP317-326.

      Page 9 § 3 and Page 10 § 1

      Q14. The authors should have used a construct that does not kill the cells much earlier, just after the screening experiments based on resistance to necrosis induced by TAT-rasGAP. For Supp. Fig 8a and b: I am fully convinced by Raji cells and HeLa cells but not by the SKW6.4 cells.

      A14. As mentioned in our answer to point 10, we agree that SKW6.4 cells present intermediate phenotypes probably because, unlike Raji and HeLa cells, a combination of ion channels seems to regulate the plasma membrane potential. As indicated above, we can move the SKW6.4 data to the supplementary information to clarify the message presented in the main text. Again, feedback from the editors is welcome here.

      Page 10 § 2

      Q15. A) Supp. Fig 9 is quite convincing but adds the information that 2 µM are sufficient in neurons. This again makes the 20 to 80 µM concentrations used on transformed cells unsatisfactory.

      B) If one needs a cell line (more user friendly than primary cultures), there are several neural ones that can be differentiated (SHY, LHUMES, etc.) that may have an appropriate membrane potential (below -90mV). Indeed, it would then be important to verify if pore formation is still induced by TAT, R9 and Penetratin (separately) on "naturally" hyperpolarized cells.

      C) Figure 2a confirms that changes in Vm are not solid for HeLa and SKW6.4 cells. This casts a doubt on the validity of the results obtained with the latter 2 cell lines.

      A15. A) The experiments shown in Supplementary Fig. 9d with cortical rat neurons and HeLa cells were performed in the absence of serum, which accounts for the low concentrations used. We apologize for not emphasizing enough whether experiments were performed in the presence or in the absence of serum, which explains the use of high CPP concentrations (40-80 µM) and low CPP concentrations (2-10 µM), respectively. We would like to emphasize, however, that we adjusted the concentrations of CPPs in our study so as to obtain similar levels of CPP activity or CPP uptake between the different cell lines used. The concentrations used should not be compared as mere numbers; it is the CPP activity or uptake that should be considered.

      B) We thank the reviewer for his/her suggestion. To address this point, we will perform a new experiment to determine if in neurons TAT, R9, and Penetratin induce pores (using the PI uptake approach).

      C) Please see our answer to point 10.

      Page 11 § 2

      Q16. Why was valinomycin only tried on Raji cells?

      A16. In this study, valinomycin was used on Raji and HeLa cells (Figure 2 and 3). We did not use valinomycin on SKW6.4 cells, as the drug-induced hyperpolarization levels were insufficient in this cell line. As we got a nice hyperpolarization in HeLa wild-type and KCNN4 KO cells through ectopic expression of the KCNJ2 potassium channels (which restored the ability of the KO cells to take up the CPPs), we did not perform the CPP uptake experiment with valinomycin in HeLa cells (although we had tested that valinomycin is able to hyperpolarize HeLa cells).

      Page 12 § 2

      Q17. A) Looking at Fig. 2c, it seems that low Vm increases the uptake of all CPPs, except Transportan. Is there any reason why this Figure does not provide the number of vesicles per cell in the hyperpolarized conditions?

      B) In fact, if one goes to Supp. Fig. 9c, it appears that, among all peptides, only Penetratin is almost entirely cytoplasmic after 90' of incubation, whereas MAP and Transportan remain essentially vesicular. TAT and R9 are at mid-distance between these two extremes. This leads to send again the warning that all CPPs cannot be placed in a single category. The table that describes the sequences strongly suggests that TAT and R9 uptake is due to the numerous Rs that cannot be replaced by Ks. In the case of Penetratin, that only has 3 Rs, the situation is thus different with the presence of 2 Ws previously shown to be mandatory for internalization, although absent in TAT and R9.

      C) In Supp. Fig. 9, panel g is useless.

      D) A difference between peptides is also visible in Figure 2d where depolarization with KCl does not show the same efficiency on all peptides. The issue is whether these differences are significant and, if so, why? This discussion could be restricted to TAT, R9 and Penetratin.

      E) Supp. Fig. 10a also suggests that all peptides do not respond similarly to depolarization and that the effects differ between cell types and concentrations used. However, given the high concentrations used and the high variance between replicates, this figure might not be a priority in the reorganization of the manuscript.

      A17. A) As mentioned in the figure legend, “Quantitation of vesicles was not performed in hyperpolarizing conditions due to masking from strong cytosolic signal.” Quantifying vesicles in these conditions would create a bias towards underestimation of vesicle numbers in cells displaying a strong cytosolic signal.

      B) We agree with the reviewer that Transportan enters cells primarily through endocytosis. This is mentioned in the text as well as other differences that were observed with regards to the prevalence of endocytosis or direct translocation. These mentions are reported below.

      Page 12: “With the notable exception of Transportan, depolarization led to decreased cytosolic fluorescence of all CPPs, while hyperpolarization favored CPP translocation in the cytosol (Fig. 2c, Supplementary Fig. 9h and 10a). Transportan, unlike the other tested CPPs, enters cells predominantly through endocytosis (Supplementary Fig. 9e), which could explain the difference in response to Vm modulation.”

      Page 14: “Even though this extrapolation is likely to lack accuracy because of the well-known limitation of the MARTINI forcefield in describing the absolute kinetics of the molecular events, the values obtained are consistent with the kinetics of CPP direct translocation observed in living cells (Figure 1c and Supplementary Fig. 1b and 9e). With the exception of Transportan, the estimated CPP translocation occurred within minutes. This is consistent with our observation that Transportan enters cells predominantly through endocytosis and its internalization is therefore not affected by changes in Vm (Fig 2c-d and Supplemental Fig. 9e)”.

      Page 20: “On the other hand, when endocytosis is the predominant type of entry, CPP cytosolic uptake will be less affected by both hyperpolarization and depolarization, which is what is observed for Transportan internalization in HeLa cells (Fig. 2c and Supplementary Fig. 10a).”

      Concerning the roles of arginine and tryptophan residues, please refer to our answer #4.

      C) We do not think this panel (now panel h) is useless as it shows representative examples of the quantitation shown in Figure 2c. We can however remove it if requested by the editors.

      D) The reviewer is correct in observing that KCl-induced depolarization does not inhibit the uptake of all tested CPPs to the same extent. As mentioned in the text, these differences can be explained by the prevalence of direct translocation in the cells. For example, Transportan enters cells primarily through endocytosis, which, as we show, is not regulated/affected by the membrane potential (Figure 2c, lower graphs). Consequently, it is expected that KCl treatment will not affect Transportan cellular uptake.

      E) The reviewer is correct in mentioning that there is quantitative heterogeneity between the different CPPs tested. We mentioned these differences in the manuscript. These mentions are those reported under B, plus those listed below.

      Page 19: “It is known for example that peptides made of 9 lysines (K9) poorly reach the cytosol (Fig. 3f and Supplementary Fig. 9e) and that replacing arginine by lysine in Penetratin significantly diminishes its internalization10,11. According to our model, K9 should induce megapolarization and formation of water pores that should then allow their translocation into cells. However, it has been determined that, once embedded into membranes, lysine residues tend to lose protons12,13. This will thus dissipate the strong membrane potential required for the formation of water pores and leave the lysine-containing CPPs stuck within the phospholipids of the membrane. In contrast, arginine residues are not deprotonated in membranes and water pores can therefore be maintained allowing the arginine-rich CPPs to be taken up by cells.”

      Page 21: “Therefore, the uptake kinetics of lysine-rich peptides, such as MAP, appear artefactually similar to the uptake kinetics of arginine-rich peptides such as R9 (Supplementary Fig. 11b).”

      Page 21: “The differences between CPPs in terms of how efficiently direct translocation is modulated by the Vm (Fig. 2c-d and Supplementary Fig. 10a) could be explained by their relative dependence on direct translocation or endocytosis to penetrate cells. The more positively charged a CPP is, the more it will enter cells through direct translocation and consequently the more sensitive it will be to cell depolarization (Fig. 2c). On the other hand, when endocytosis is the predominant type of entry, CPP cytosolic uptake will be less affected by both hyperpolarization and depolarization, which is what is observed for Transportan internalization in HeLa cells (Fig. 2c and Supplementary Fig. 10a).”

      However, what remains is that depolarization always affects CPP uptake, at most concentrations tested. The heterogeneity reported in Supplementary Fig. 10a for a given experimental condition in a given cell type is in itself of interest as it suggests that there are varying factors within a cell population (e.g. cell cycle, metabolism, etc.) that may impact on the ability of cells to take up CPPs. As per reviewer’s suggestion we may remove this panel from the figure if instructed to do so by the editors.

      Page 12 § 3 and Page 13 § 1

      Q18. The pH story is either too long or too short.

      A18. One mechanism put forward to explain direct translocation relies on pH variation between the extracellular milieu and the cytosol14. It was therefore of interest, in the context of the model we are putting forward, to see whether pH affects the uptake of CPPs in our experimental model. Our data show that pH variations do not affect CPP direct translocation. This information should, in our opinion, be disclosed.

      Page 14 § 2

      Q19. At low Vm values, there is a decrease in free energy barrier. Does this modify temperature-dependency for internalization? Do cells really require energy when the Vm is very low, like is often the case for neurons?

      A19. We thank the reviewer for this interesting comment. We will address it by visualizing, under a confocal microscope, CPP direct translocation in rat cortical neurons incubated at various temperatures (4°C, 24°C, 37°C).

      Page 15 § 2

      Q20. Figure 2e is not explained, not even in the legend while the statement that CPPs induce a local hyperpolarization is central to the study.

      A20. As there is no Figure 2e, we believe that the reviewer is talking about Figure 3e, the legend of which was present in the initial version of the manuscript.

      Page 16 § 1

      Q21. It is confusing that the same agent, here PI, is used to measure internalization (2 nm pore formation in response to hyperpolarization) and cell death. I have seen the explanation below, but I do not find it fully satisfactory.

      A21. We have tried to explain this better under our answer to point 11B.

      Page 16 § 2

      Q22. Entry is not necessarily a size issue. Structure is an important parameter, including possible structure changes, for example in response to Vm modifications. Therefore, the statement that molecule with larger diameters are mostly prevented from internalization is not only vague ("mostly") but incorrect.

      A22. We agree with the reviewer’s comment in the sense that the secondary structure of a molecule will also play an important role in its internalization. For that reason, we used a series of molecules of identical structure (dextrans) but of different molecular weights. In these experiments, we saw that dextrans of higher molecular weight enter less efficiently than those of lower molecular weight (Figure 3). We will rephrase some of our sentences to specify that both the size and the shape (structure) of molecules determine their ability to enter cells through water pores that are characterized by a certain diameter.

      Page 2: “Using dyes of varying sizes and shapes, we assessed the diameter of the water pores.”

      Page 4: “translocation and we characterize the diameter of the water pores used by CPPs.”

      Page 15: “cells were co-incubated with molecules of different sizes and structure and FITC-labelled CPPs at a peptide/lipid ratio of 0.012-0.018 (Supplementary Fig. 11c-d).”

      Page 16: “3 kDa, 10 kDa, and 40 kDa dextrans, 2.3 ± 0.38 nm, 4.5 nm and 8.6 nm (diameter estimation provided by Thermofisher), respectively, were used to estimate the diameter of the water pores formed in the presence of CPP.”

      Page 16: “These results are in line with the in silico prediction of the water pore diameter obtained by analyzing the structure of the pore at the transition state.”

      Page 16: “The marginal cytosolic co-internalization of dextrans was inversely correlated with their diameter.”

      Page 35: “200 µg/ml dextran of different molecular weight in the presence or in the absence of the indicated CPPs in normal […]”.

      Page 17 § 4 and Page 18 § 1

      Q23. In Supp. Fig. 13b and c, since the GAP domain is mutated, death is not due to RasGAP activity. So what causes zebrafish death (hyperpolarization?) The results seem contradictory with those of Supp. Fig 13f where survival is 100% at 48 h.

      A23. Indeed, it appears that valinomycin in water leads to zebrafish embryo death, as can be seen in Supplementary Fig. 13c. However, the main difference between Supplementary Fig. 13c and 13f is that, in Supplementary Fig. 13f, zebrafish were not incubated in valinomycin-containing water but were locally injected with a CPP in the presence or in the absence of valinomycin. This has now been clarified in the text. We saw that local injections of the hyperpolarizing agent are much less toxic and are well tolerated by the zebrafish embryos.

      Page 18 § 2

      Q24. The formation of inverted micelles is not incompatible with that of pores. CPP-induced hyperpolarization (Vm) is not measured directly, but deduced from experiments involving artificial membranes and in silico modeling. It would be useful to distinguish between what takes place on live cells (in vitro and in vivo) and what is speculated (based on modeling and artificial systems).

      A24.

      The formation of inverted micelles is not incompatible with that of pores.

      As mentioned above (point 9), we do also think that what has been presented as inverted micelles could have been in fact water pores.

      CPP-induced hyperpolarization (Vm) is not measured directly, but deduced from experiments involving artificial membranes and in silico modeling. It would be useful to distinguish between what takes place on live cells (in vitro and in vivo) and what is speculated (based on modeling and artificial systems).

      If we understand this point correctly, the reviewer is talking about the -150 mV hyperpolarization. This value is not a speculation but has been estimated from in silico experiments and also from experiments using live cells (not artificial membranes). In living cells, the hyperpolarization (megapolarization) has been estimated based on accumulation of intracellular PI over time in the presence or in the absence of CPP.

      Page 19 § 3

      Q25A. The model posits that the number of Rs influences the ability of the CPPs to hyperpolarize the membrane and, consequently, to induce pore formation. Since pore formation is key to the addressing to the cytoplasm, how can one explain that Penetratin which has only 3 Rs is transported to the cytoplasm more readily than TAT or R9? The authors should take this contradiction into consideration and should not leave aside, in the literature, what does not fit with their model.

      A25A. We fully agree that this should be discussed and not left aside. Please refer to point 4 for detailed discussion about the role of arginine and tryptophan in the ability of CPPs to translocate across membranes.

      Q25B. The fact that Rs cannot be replaced by Ks, both in R9 and Penetratin, is explained by differences in deprotonation. This is interesting but speculative. It might be that the interactions of Rs versus Ks with lipids and sugars are different and not only based on charge. After all, their atomic structures, beyond charges, are different.

      A25B. We do not claim that protonation differences between R and K are the definitive explanation for their differing ability to promote CPP translocation. It is one possible explanation that we find sound. As suggested by the reviewer, the ability of K and R to bind lipids and sugars can also play a role. We can mention in this context that the guanidinium group of arginine residues can form two hydrogen bonds1, which allows for more stable electrostatic interactions, while the lysyl group of lysine residues can only form one hydrogen bond. We have included these additional possibilities in the revised version of our manuscript as indicated under point 4.

      Page 20 § 1

      Q26. We still need to understand endosomal escape.

      A26. We agree with the reviewer that endosomal escape is still poorly understood. This is an interesting research topic that deserves its own separate study.

      **Major comments**

      • The key conclusions are convincing for a subset of CPPs and cell types
      • Yes, some claims should be qualified as speculative, but not preliminary
      • Many experiments should be removed. Neuronal primary cultures should be introduced to verify the main conclusions, at least for the 3 main CPPs (TAT, R9, Penetratin). Answers must be given to the concentration issue. Vesicles should be characterized as well as the localization of the peptides in or around the vesicles. See above for less decisive but still important experiments that would benefit the study.
      • Yes, the requested experiments correspond to a reasonable cost and amount of time (10 to 20,000 € and 3 to 5 months of work)
      • Yes, the methods are presented in great detail.
      • Yes, the experiments are adequately replicated and statistical analysis is adequate

      **Minor comments (not so minor for some of them)**

      • See "Detailed analysis"
      • No, prior studies are not referenced appropriately (see above)
      • No, the text and figures are not clear and not accurate (see above)
      • (i) use Raji cells and primary neuronal cultures, plus in vivo model and forget the other cell types; (ii) forget MAP and Transportan and compare TAT/R9 and Penetratin; (iii) drastically reduce the number of figures, tables and movies (6 primary figures, 6 supplemental figures and 4 tables are reasonable numbers; movies are not absolutely necessary); (iv) limit to 6 (max) the number of panels per figure; (v) limit the number of references to less than 50 and cite the primary reports rather than reviews; (vi) reduce the size of the Material and Methods and the length of figure legends.

      Reviewer #1 (Significance (Required)):

      • The mode of CPP internalization is an unanswered question and the report, if revised, will represent a conceptual and technical advance.
      • Bits and pieces of the conclusions can be found in previous reports. But the Vm-dependent pore formation as well as the CPP-induced "megapolarization" (even if only shown for a subset of CPPs) would be an important contribution. The authors must resist the temptation to generalize to all CPPs what might only be true for a few of them.
      • I do not have the expertise for the in-silico work, but my field of expertise allows me to understand all other aspects of the manuscript.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, the authors investigated the effect of membrane potential on the internalization of CPPs into the cytosol of some cancer cell lines. Using a CRISPR/Cas9-based screening, they found that some potassium channels play an important role in the internalization of CPPs. The depolarization decreases the rate of internalization of CPPs and the hyperpolarization using valinomycin increases the rate. Using the coarse-grained MD simulations, the authors investigated the interaction of CPPs with a lipid bilayer in the presence of membrane potential. In the interaction of CPPs with the cells, propidium iodide (PI) enters the cytosol significantly. Based on this result, the authors concluded that pores with 2 nm diameter are formed in the plasma membrane.

      This reviewer raises one main issue concerning CPP endocytosis. The reviewer challenges our method for investigating CPP direct translocation and specifically asks how we make sure that what we consider direct translocation is not a combination of CPP endocytosis (followed or not by endosomal escape) and CPP plasma membrane translocation. As explained in detail below, our methodology is able to accurately distinguish CPP uptake by direct translocation from CPP endocytosis, and we further demonstrate that endosomal escape does not occur in our experimental settings.

      Q27. One of the defects in this manuscript is the method to determine the fraction of internalization of CPPs via direct translocation across plasma membrane. The authors estimated the fraction of the direct translocation of CPPs by the fluorescence intensity of the cytosolic region (devoid of endosomes) and the fraction of the internalization via endocytosis by the fluorescence intensity of vesicles. However, the CPPs can enter the cytoplasm via endocytosis, and thus the increase in the fluorescence intensity of the cytoplasm is due to two processes (via endocytosis and direct translocation). The authors should use inhibitors of clathrin-mediated endocytosis and macropinocytosis to determine the fraction of internalization of CPPs via direct translocation accurately. Low temperature (4 C) has been also used as the inhibitor of endocytosis (e.g., J. Biophysics, 414729, 2011; J. Biol. Chem., 284, 33957, 2009). Supplementary Figure 1i (the temperature dependence of internalization of TAT-RasGAP317-326) clearly shows that at 4 C the fraction of the internalization was very low, indicating that this peptide enters the cytosol mainly via endocytosis. The determination of the fraction of the internalization via endocytosis by the fluorescence intensity of vesicles in this manuscript is not accurate because it is difficult to examine all endosomes in cells and it is not easy to discriminate the fluorescence intensity due to the endosomes from that due to the cytosol.

      It is important to follow a time course of the fluorescence intensity of single cells from the beginning of the interaction of CPPs with the cells (at least from 5 min) in the presence and absence of inhibitors of the endocytosis (J. Biol. Chem., 278, 585, 2003) to elucidate the process of the internalization of CPPs in the cytosol.

      A27. The reviewer raises the possibility that the signal of fluorescent CPPs in endosomes somehow perturbs the acquisition of the signal in the cytosol. This could occur in two ways: CPP endosomal escape, or diffusion of the signal located in endosomes into adjacent cytosolic regions (halo effect). The second possibility can be readily dismissed because in situations where cells only take up fluorescent CPPs by endocytosis, the cytosol emits only background fluorescence (autofluorescence). This can be seen in Supplementary Fig. 1a (“vesicular” condition) or in Supplementary Fig. 9h in the depolarized cells that cannot take up CPP by direct translocation. Also note that when we record the cytosolic signal, we take great care to use regions of interest (ROI) that are distant from endosomes. In contrast to what the reviewer is saying (“it is not easy to discriminate the fluorescence intensity due to the endosomes from that due to the cytosol”), it is actually not difficult to discriminate the cytosolic fluorescence from the endosomal fluorescence. To illustrate this, we now provide examples of high magnification images of cells incubated with fluorescent CPPs (new Supplementary Fig. 1c, right[1]) to better explain our methodology and to show that it is quite straightforward to find cytosolic areas devoid of endosomes. Such high magnification images are those that are used for our blinded quantitation. The other possibility is endosomal escape. We demonstrate in Supplementary Fig. 7c that in our experimental conditions, no endosomal escape is detected[2]. We may not have explained our methodology well enough in the earlier version, and we will improve the description of our quantitation procedures in the revised version. To this end, we have now added a scheme illustrating the experimental setup (now part of Supplementary Fig. 7c) that is used to assess endosomal escape.

      The reviewer also questions the way we quantitate the CPP signals in endosomes. In the present paper, our goal is to characterize the direct translocation process of CPPs into cells. We do not wish here to investigate in detail the endocytic pathway taken by CPPs. This has been done in a separate study that we are currently submitting for publication. In a nutshell, this work shows that the endocytic pathway taken by CPPs is different from the classical Rab5- and Rab7-dependent pathway and that the CPP endocytic pathway is not inhibited by compounds that affect the classical pathway. Thus, even if we had wanted to use the inhibitors mentioned by the reviewer, they would not have blocked CPP endocytosis.

      To sum up the issues raised under this point, we believe we have presented the reasons why there are no grounds to support the concerns raised by the reviewer.
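
      The ROI-based cytosolic readout described in A27 can be illustrated with a minimal sketch. This is not the authors' actual quantitation pipeline: the function name `cytosolic_mean`, the intensity threshold used to flag endosome-like puncta, and the pixel exclusion radius are all hypothetical choices made for illustration. The idea shown is only that pixels near bright vesicles are masked out before the cytosolic mean is taken, so that a halo around endosomes does not inflate the cytosolic value.

```python
import numpy as np


def cytosolic_mean(image, cell_mask, endosome_threshold, exclusion_radius=2):
    """Mean intensity of cell pixels lying away from bright puncta.

    Hypothetical helper for illustration only: puncta are defined by a
    simple intensity threshold, and every pixel within `exclusion_radius`
    pixels (square neighbourhood) of a punctum is left out of the ROI.
    """
    puncta = (image >= endosome_threshold) & cell_mask

    # Dilate the puncta mask by OR-ing shifted copies of it.
    pad = exclusion_radius
    padded = np.pad(puncta, pad, mode="constant", constant_values=False)
    height, width = puncta.shape
    excluded = np.zeros_like(puncta)
    for dy in range(2 * pad + 1):
        for dx in range(2 * pad + 1):
            excluded |= padded[dy:dy + height, dx:dx + width]

    roi = cell_mask & ~excluded
    if not roi.any():
        raise ValueError("no cytosolic pixels left after excluding puncta")
    return float(image[roi].mean())


# Synthetic example: uniform background, one bright punctum with a dim halo.
demo = np.full((20, 20), 10.0)
demo[10, 10] = 200.0                       # endosome-like punctum
for y, x in [(9, 10), (11, 10), (10, 9), (10, 11)]:
    demo[y, x] = 50.0                      # halo next to the punctum
whole_cell = np.ones_like(demo, dtype=bool)
demo_value = cytosolic_mean(demo, whole_cell, endosome_threshold=100.0)
```

      In this synthetic case the punctum and its halo fall inside the excluded neighbourhood, so the returned value reflects the background only, whereas a plain mean over the whole cell mask would be pulled upward by the vesicular signal.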

      [1] Supplementary Fig. 1c (right) is mentioned in the “Cell death and CPP internalization measurements” section of the methods.

      [2] In this experiment, cells were incubated with CPPs for 30 minutes to allow CPP entry into cells. Then the cells were either washed (to prevent further uptake including uptake through direct translocation) or incubated in the continued presence of CPPs. In both conditions, cells where only endocytosis took place were followed by time-lapse confocal microscopy for 4 hours (i.e. these cells do not display any cytosolic CPP signal at the beginning of the recording). We then assessed the CPP fluorescence intensity within the cytosol (i.e. away from endosomes). From these experiments we saw that cytosolic fluorescence increased only in conditions where CPP was present in the media throughout the experiment. No increase of cytosolic fluorescence was detected in the condition where CPPs were washed out. In conclusion, these results demonstrate that the cytosolic signal that we observed in our experiments is due to direct translocation and not endosomal escape. In these experiments we have used the LLOME lysosomotropic agent as a control to make sure that if endosomal escape had occurred (even if only from a subset of endosomes/lysosomes), we would have been able to detect it. Indeed, upon addition of LLOME we were able to record CPP release from endosomes to the cytosol. There is therefore no endosomal escape occurring in our experimental conditions. In conclusion, the observed cytosolic signal in our confocal experiments does not originate, even partly, from endosomal escape.

      Supplementary Figure 1i (the temperature dependence of internalization of TAT-RasGAP317-326) clearly shows that at 4 C the fraction of the internalization was very low, indicating that this peptide enters the cytosol mainly via endocytosis.

      The experiment shown in Supplementary Fig. 1i was analyzed by flow cytometry that cannot discriminate the cytosolic signal from the endosomal signal. We will therefore perform this experiment again but this time using confocal imaging to record the impact of temperature on CPP cytosolic acquisition. We have performed this for HeLa cells already and this shows that direct translocation is indeed inhibited by low temperatures (full blockage at 4°C). Bear in mind that no endosomal escape occurs in our settings (see Supplementary Fig. 7c). This indicates that the decrease in cytoplasmic fluorescence induced by low temperature is not a consequence of diminished CPP endocytosis.

      Q28. Recently, it has been well recognized that membrane potential greatly affects the structure, dynamics and function of plasma membranes (e.g., Science, 349, 873, 2015; PNAS, 107, 12281, 2010). The results of the effect of membrane potential on the internalization of CPPs (depolarization decreases the rate of internalization and hyperpolarization increases the rate), which is main results of this manuscript, can be interpreted by various ways. For example, the rate of endocytosis may be greatly controlled by membrane potential, which can explain the authors' results.

      A28. This reviewer may have missed the experiment presented in Figure 2c that clearly shows that CPP endocytosis is unaffected by depolarization or hyperpolarization of cells. We have also determined that transferrin uptake through endocytosis is not affected by potassium channel knockout (which also leads to depolarization). The possibility raised by the reviewer is therefore refuted by our experimental evidence.

      Q29. A) The authors used similar concentrations of various CPPs for their experiments (10 to 40 microM), and did not examine the peptide concentration dependence of the internalization. It has been recognized that the CPP concentration affects the mode of internalization of CPPs (e.g., J. Biol. Chem., 284, 33957, 2009). The authors should examine the peptide concentration dependence of the mode of internalization (less than 10 microM, e.g., 1 microM).

      B) In the case of depolarization, can higher concentrations of CPPs (e.g., 100 microM) induce their internalization?

      A29. A) We agree that the CPP/cell ratio might favor one mode of entry over the other. It has been reported by imaging that at lower CPP concentrations endocytosis is favored, since only vesicles were observed15-19. Our data confirm this (new Supplementary Fig. 9f).

      B) In Supp. Fig. 7e we have incubated KCNQ5 KO Raji cells, which are slightly more depolarized than WT cells, in the presence of increasing CPP concentrations up to 100 μM. From the obtained results, we can see that at 100 μM, the uptake in depolarized cells is increased but does not reach the level of uptake seen in wild-type cells. Therefore, lack of hyperpolarization can be compensated to a mild extent by increased CPP availability.

      Q30. A) The effects of membrane potential on plasma membranes and lipid bilayers have been extensively investigated experimentally and thus are well understood, although currently the coarse-grained MD simulations cannot provide quantitative results which can be compared with experimental results. In this manuscript, using the coarse-grained MD simulations, the authors applied 2.2 V to a lipid bilayer to examine the translocation of CPPs. However, it is well known from experimental results that application of such a large voltage to a lipid bilayer induces pore formation in the membrane or its rupture (Bioelectrochem. Bioenerg., 41, 135, 1996; Sci. Rep., 7, 12509, 2017), but at low membrane potential (

      B) What is the probability of the existence of R9 in the surface of the membrane? R9 cannot bind to the electrically neutral lipid bilayers (such as PC) under a physiological ion concentration (Biochemistry, 55, 4154, 2016). Even if in the case of R9 the membrane potential reaches -150 mV, the other CPPs have lower surface charge density than that of R9, and hence, the decrement of membrane potential is lower. The authors should provide the data of other CPPs.

      C) It has been reported that the negative membrane potential increases the rate of entry of two kinds of CPPs into the lumen of giant unilamellar vesicles (GUVs) without leakage of water-soluble fluorescent probe (Stokes-Einstein radius; ~0.9 nm diameter), i.e., no pore formation in the GUV membrane (Biophys., 118, 57, 2020, J. Bacteriology, 2021, DOI: 10.1128/JB.00021-21). The authors should discuss the similarity and the difference between the results in these papers and the above results in this manuscript.

      A30. A) As correctly stated by this Reviewer, we reported simulations with high transmembrane potential values, which is a common procedure in in silico simulations used to accelerate the kinetics of the studied process. In this manuscript we have additionally developed and carefully validated a novel protocol to estimate the free energy landscape of water pore formation and CPP translocation under physiological transmembrane potential (further details about the methodological procedure, the convergence and the validation of the free energy estimation are reported in Supplementary Fig. 15-19 of the manuscript). This protocol allowed us to demonstrate the impact of megapolarization (‑150 mV) on the free energy barrier corresponding to the CPP translocation process. The results exemplify how the megapolarization process modifies the uptake probability of the R9 peptide, reducing locally the free energy barrier of the membrane translocation (Fig. 3c-d). Moreover, we have also demonstrated how a single CPP produces a local transmembrane potential of about -150 mV, in agreement with our hypothesis (Fig. 3e).

      Finally, the quantitative accuracy of the molecular simulations was found to be satisfactory because the water pore formation free energy in a symmetric DOPC membrane that we calculated is in excellent agreement with previous atomistic estimation (Table S5).

      B) It has been demonstrated that CPP/membrane interactions are mostly electrostatic, occurring between positively charged amino acids carried by the CPPs and various negatively charged cell membrane components, such as glycosaminoglycans20-31 and phosphate groups32. It is in line with our model that the more positively charged CPPs are, the better they should translocate into cells. Therefore, we agree with the reviewer that the level of megapolarization may vary according to the charges carried by the CPPs. However, our data clearly indicate that a certain membrane potential hyperpolarization threshold must be achieved to induce water pore formation. As suggested by the reviewer, we will now conduct additional modeling experiments with other CPPs.

      C) We have carefully read these papers and do not necessarily reach the same conclusions as the authors. In both papers, the translocation of CPPs in polarized GUVs is monitored through CPP acquisition on vesicles found within the GUVs (intraluminal vesicles; either smaller GUVs or LUVs). There is actually no evidence of the presence of luminal CPPs outside of the intraluminal vesicle membranes. We would therefore argue that these studies elegantly demonstrate that membrane potential increases CPP binding and insertion into the membrane of the mother GUVs but that the CPPs then move, by diffusion, from the lipidic boundary of the mother GUVs to the lipidic membranes of their intraluminal vesicles. This CPP diffusion would presumably occur when the intraluminal vesicles touch the outer membrane bilayer of the mother GUV. There is a marked lag between binding of the CPPs to the membrane of the mother GUV and appearance of CPPs on the intraluminal vesicles (Figure 3c of the Biophysical Journal paper). This lag is, in our view, more compatible with the explanation we are giving than with a translocation mechanism. If there were direct translocation of the CPP through the membrane of the mother GUV, such a large lag would not be expected (see next point). If there is no translocation of the CPPs across the GUV membrane, it could explain why the water-soluble dye within the mother GUVs does not leak out.

      Q31. The authors consider that the translocation of CPPs induces depolarization, and as a result, the pore closes immediately. This kind of transient pore cannot explain the authors' result of the significant entry of PI into the cytosol during the interaction of CPPs with the cells. The authors should explain this point.

      A31. Our interpretation is that PI takes advantage of the water pore triggered by hyperpolarization to penetrate cells. PI is positively charged and is attracted by the negative membrane potential of the cells. Its movement across the cell membrane is therefore unidirectional. This enables the PI molecules to accumulate/concentrate within the cytosol (Supplementary Fig. 12). When PI is in the presence of a CPP, both molecules enter with similar kinetics (Supplementary Fig. 12a and the new quantitation provided in the partially revised version of the manuscript; Supplementary Fig. 12b). PI and CPPs do not interact (Supplementary Figure 12d); hence they move independently from one another.

      Q32. In this manuscript, the authors used only cancer cell lines (Raji cell, SKW6.4 cell, and HeLa cell). The lipid compositions and the stability of the plasma membranes of these cells may be different from normal cells (e.g., 33; Cancer Res., 51, 3062, 1991). Is there a possibility that negatively charged lipids such as PS and PIP2 locate in the outer leaflet locally in these cells? At least some discussion of this point is essential.

      A32. We agree with the reviewer that plasma membrane composition may vary between cancerous and non-cancerous cells and that this may affect the ability of CPPs to cross cellular membranes. We now mention this in the discussion: “While the nature of the CPPs likely dictate their uptake efficiency as discussed in the precedent paragraph, the composition of the plasma membrane could also modulate how CPPs translocate into cells. In the present work, we have recorded CPP direct translocation in transformed or cancerous cell lines as well as in primary cells. These cells display various abilities to take up CPPs by direct translocation and the present work indicates that this is modulated by their Vm. But as cancer cells display abnormal plasma membrane composition33, it will be of interest in the future to determine how important this is on their capacity to take up CPPs”.

      Q33. The authors found that PI enters the cytosol significantly when CPPs interact with these cells. Based on this result, the authors concluded that pores with 2 nm diameter are formed in the plasma membrane. However, they did not show the time courses of entry of PI and that of CPPs, and thus we cannot judge whether the pore formation in the plasma membrane is the cause of the entry of CPPs or the result of the entry of CPPs. We can reasonably consider that CPPs enter the cytosol via endocytosis and bind to the inner leaflet of the plasma membrane, inducing pore formation in the plasma membrane.

      A33. The kinetics we are now showing in point A31 indicate co-entry of CPPs and PI, an observation that is in line with our model. Also note that we have demonstrated that CPPs do not escape endosomes (please see our answers to questions 12 and 28). These data are therefore not compatible with the reviewer’s interpretation.

      Q34. It has been reported that the negative membrane potential increases the rate constant of antimicrobial peptide (AMP)-induced pore formation or local damage in the GUV membrane (J. Biol. Chem., 294, 10449, 2019; BBA-Biomembranes, 1862, 183381, 2020). These results are related to those in the present manuscript, because here the authors consider that CPPs induce pores in the plasma membrane in the presence of negative membrane potential.

      A34. We thank the reviewer for mentioning these interesting articles. As we understand them, they demonstrate that antimicrobial peptides (AMPs) bind membranes better as a function of increasing negative membrane potential and that this favors their ability to form pores in the membrane, compromising membrane integrity and inducing the release of cytosolic or luminal content. These AMPs do not behave exactly like CPPs because the latter do not compromise the integrity of the membranes.

      In conclusion, the results of the membrane potential dependence of the rate of the internalization of CPPs may be solid results, which is an important contribution. However, the other analyses and the interpretations are not conclusive at the current stage.

      We thank the reviewer for the positive assessment of our results concerning the membrane potential dependence on CPP uptake. Hopefully we have clarified the remaining points with our answers developed above and with the new data we are presenting.

      Reviewer #2 (Significance (Required)):

      (1) Using a CRISPR/Cas9-based screening, the authors found that some potassium channels play an important role in the internalization of CPP TAT-RasGAP317-326. This result advances the field of CPPs.

      (2) Several studies have suggested that depolarization decreases the rate of internalization of CPPs into the cell cytosol and that hyperpolarization increases the rate. It has also been reported that negative membrane potential increases the rate of entry of two kinds of CPPs into the lumen of GUVs of lipid bilayers. The authors provide new genetic evidence that membrane potential plays an important role in the internalization of CPPs in the cytosol. However, modulation of membrane potential greatly affects the structure, dynamics and function of plasma membranes. At the current stage, it is difficult to judge which process of the internalization of CPPs is affected by the membrane potential.

      (3) Researchers working on CPPs and AMPs will be interested in these results once the authors improve the contents of the manuscript.

      (4) My field of expertise is membrane biophysics, especially the interaction of AMPs and CPPs with GUVs and cells.

      References

      1 Fromm, J. R., Hileman, R. E., Caldwell, E. E. O., Weiler, J. M. & Linhardt, R. J. Differences in the Interaction of Heparin with Arginine and Lysine and the Importance of these Basic Amino Acids in the Binding of Heparin to Acidic Fibroblast Growth Factor. Archives of Biochemistry and Biophysics 323, 279-287, doi:10.1006/abbi.1995.9963 (1995).

      2 Derossi, D., Joliot, A. H., Chassaing, G. & Prochiantz, A. The third helix of the Antennapedia homeodomain translocates through biological membranes. The Journal of biological chemistry 269, 10444-10450 (1994).

      3 Jobin, M. L., Blanchet, M., Henry, S., Chaignepain, S., Manigand, C., Castano, S., Lecomte, S., Burlina, F., Sagan, S. & Alves, I. D. The role of tryptophans on the cellular uptake and membrane interaction of arginine-rich cell penetrating peptides. Biochim Biophys Acta 1848, 593-602, doi:10.1016/j.bbamem.2014.11.013 (2015).

      4 MacCallum, J. L., Bennett, W. F. D. & Tieleman, D. P. Distribution of amino acids in a lipid bilayer from computer simulations. Biophysical journal 94, 3393-3404, doi:10.1529/biophysj.107.112805 (2008).

      5 Christiaens, B., Symoens, S., Vanderheyden, S., Engelborghs, Y., Joliot, A., Prochiantz, A., Vandekerckhove, J., Rosseneu, M. & Vanloo, B. Tryptophan fluorescence study of the interaction of penetratin peptides with model membranes. European Journal of Biochemistry 269, 2918-2926, doi:10.1046/j.1432-1033.2002.02963.x (2002).

      6 Walrant, A., Bauza, A., Girardet, C., Alves, I. D., Lecomte, S., Illien, F., Cardon, S., Chaianantakul, N., Pallerla, M., Burlina, F., Frontera, A. & Sagan, S. Ionpair-pi interactions favor cell penetration of arginine/tryptophan-rich cell-penetrating peptides. Biochim Biophys Acta Biomembr 1862, 183098, doi:10.1016/j.bbamem.2019.183098 (2020).

      7 Derossi, D., Calvet, S., Trembleau, A., Brunissen, A., Chassaing, G. & Prochiantz, A. Cell internalization of the third helix of the Antennapedia homeodomain is receptor-independent. J Biol Chem 271, 18188-18193, doi:10.1074/jbc.271.30.18188 (1996).

      8 Serulla, M., Ichim, G., Stojceski, F., Grasso, G., Afonin, S., Heulot, M., Schober, T., Roth, R., Godefroy, C., Milhiet, P. E., Das, K., Garcia-Saez, A. J., Danani, A. & Widmann, C. TAT-RasGAP317-326 kills cells by targeting inner-leaflet-enriched phospholipids. Proc Natl Acad Sci U S A, doi:10.1073/pnas.2014108117 (2020).

      9 Bowman, A. M., Nesin, O. M., Pakhomova, O. N. & Pakhomov, A. G. Analysis of plasma membrane integrity by fluorescent detection of Tl(+) uptake. J Membr Biol 236, 15-26, doi:10.1007/s00232-010-9269-y (2010).

      10 Mitchell, D. J., Kim, D. T., Steinman, L., Fathman, C. G. & Rothbard, J. B. Polyarginine enters cells more efficiently than other polycationic homopolymers. J Pept Res 56, 318-325 (2000).

      11 Amand, H. L., Rydberg, H. A., Fornander, L. H., Lincoln, P., Norden, B. & Esbjorner, E. K. Cell surface binding and uptake of arginine- and lysine-rich penetratin peptides in absence and presence of proteoglycans. Biochim Biophys Acta 1818, 2669-2678, doi:10.1016/j.bbamem.2012.06.006 (2012).

      12 Armstrong, C. T., Mason, P. E., Anderson, J. L. & Dempsey, C. E. Arginine side chain interactions and the role of arginine as a gating charge carrier in voltage sensitive ion channels. Sci Rep 6, 21759, doi:10.1038/srep21759 (2016).

      13 Li, L., Vorobyov, I. & Allen, T. W. The different interactions of lysine and arginine side chains with lipid membranes. J Phys Chem B 117, 11906-11920, doi:10.1021/jp405418y (2013).

      14 Herce, H. D., Garcia, A. E. & Cardoso, M. C. Fundamental molecular mechanism for the cellular uptake of guanidinium-rich molecules. J Am Chem Soc 136, 17459-17467, doi:10.1021/ja507790z (2014).

      15 Kosuge, M., Takeuchi, T., Nakase, I., Jones, A. T. & Futaki, S. Cellular Internalization and Distribution of Arginine-Rich Peptides as a Function of Extracellular Peptide Concentration, Serum, and Plasma Membrane Associated Proteoglycans. Bioconjugate Chemistry 19, 656-664, doi:10.1021/bc700289w (2008).

      16 Fretz, M. M., Penning, N. A., Al-Taei, S., Futaki, S., Takeuchi, T., Nakase, I., Storm, G. & Jones, A. T. Temperature-, concentration- and cholesterol-dependent translocation of L- and D-octa-arginine across the plasma and nuclear membrane of CD34+ leukaemia cells. The Biochemical journal 403, 335-342, doi:10.1042/BJ20061808 (2007).

      17 Drin, G., Cottin, S., Blanc, E., Rees, A. R. & Temsamani, J. Studies on the internalization mechanism of cationic cell-penetrating peptides. J Biol Chem 278, 31192-31201, doi:10.1074/jbc.M303938200 (2003).

      18 Duchardt, F., Fotin‐Mleczek, M., Schwarz, H., Fischer, R. & Brock, R. A Comprehensive Model for the Cellular Uptake of Cationic Cell‐penetrating Peptides. Traffic 8, 848-866, doi:10.1111/j.1600-0854.2007.00572.x (2007).

      19 Ziegler, A., Nervi, P., Dürrenberger, M. & Seelig, J. The Cationic Cell-Penetrating Peptide CPPTAT Derived from the HIV-1 Protein TAT Is Rapidly Transported into Living Fibroblasts: Optical, Biophysical, and Metabolic Evidence. Biochemistry 44, 138-148, doi:10.1021/bi0491604 (2005).

      20 Ziegler, A. Thermodynamic studies and binding mechanisms of cell-penetrating peptides with lipids and glycosaminoglycans. Advanced Drug Delivery Reviews 60, 580-597, doi:10.1016/j.addr.2007.10.005 (2008).

      21 Rullo, A., Qian, J. & Nitz, M. Peptide–glycosaminoglycan cluster formation involving cell penetrating peptides. Biopolymers 95, 722-731, doi:10.1002/bip.21641 (2011).

      22 Bechara, C., Pallerla, M., Zaltsman, Y., Burlina, F., Alves, I. D., Lequin, O. & Sagan, S. Tryptophan within basic peptide sequences triggers glycosaminoglycan-dependent endocytosis. The FASEB Journal 27, 738-749, doi:10.1096/fj.12-216176 (2013).

      23 Gonçalves, E., Kitas, E. & Seelig, J. Binding of Oligoarginine to Membrane Lipids and Heparan Sulfate: Structural and Thermodynamic Characterization of a Cell-Penetrating Peptide. Biochemistry 44, 2692-2702, doi:10.1021/bi048046i (2005).

      24 Rusnati, M., Tulipano, G., Spillmann, D., Tanghetti, E., Oreste, P., Zoppetti, G., Giacca, M. & Presta, M. Multiple Interactions of HIV-I Tat Protein with Size-defined Heparin Oligosaccharides. Journal of Biological Chemistry 274, 28198-28205, doi:10.1074/jbc.274.40.28198 (1999).

      25 Butterfield, K. C., Caplan, M. & Panitch, A. Identification and Sequence Composition Characterization of Chondroitin Sulfate-Binding Peptides through Peptide Array Screening. Biochemistry 49, 1549-1555, doi:10.1021/bi9021044 (2010).

      26 Åmand, H. L., Rydberg, H. A., Fornander, L. H., Lincoln, P., Nordén, B. & Esbjörner, E. K. Cell surface binding and uptake of arginine- and lysine-rich penetratin peptides in absence and presence of proteoglycans. Biochimica et Biophysica Acta (BBA) - Biomembranes 1818, 2669-2678, doi:10.1016/j.bbamem.2012.06.006 (2012).

      27 Ghibaudi, E., Boscolo, B., Inserra, G., Laurenti, E., Traversa, S., Barbero, L. & Ferrari, R. P. The interaction of the cell-penetrating peptide penetratin with heparin, heparansulfates and phospholipid vesicles investigated by ESR spectroscopy. Journal of Peptide Science 11, 401-409, doi:10.1002/psc.633 (2005).

      28 Fuchs, S. M. & Raines, R. T. Pathway for polyarginine entry into mammalian cells. Biochemistry 43, 2438-2444, doi:10.1021/bi035933x (2004).

      29 Ziegler, A. & Seelig, J. Contributions of Glycosaminoglycan Binding and Clustering to the Biological Uptake of the Nonamphipathic Cell-Penetrating Peptide WR9. Biochemistry 50, 4650-4664, doi:10.1021/bi1019429 (2011).

      30 Ziegler, A. & Seelig, J. Interaction of the Protein Transduction Domain of HIV-1 TAT with Heparan Sulfate: Binding Mechanism and Thermodynamic Parameters. Biophysical Journal 86, 254-263, doi:10.1016/S0006-3495(04)74101-6 (2004).

      31 Hakansson, S. & Caffrey, M. Structural and Dynamic Properties of the HIV-1 Tat Transduction Domain in the Free and Heparin-Bound States. Biochemistry 42, 8999-9006, doi:10.1021/bi020715+ (2003).

      32 Kawamoto, S., Takasu, M., Miyakawa, T., Morikawa, R., Oda, T., Futaki, S. & Nagao, H. Inverted micelle formation of cell-penetrating peptide studied by coarse-grained simulation: importance of attractive force between cell-penetrating peptides and lipid head group. J Chem Phys 134, 095103, doi:10.1063/1.3555531 (2011).

      33 Szlasa, W., Zendran, I., Zalesinska, A., Tarek, M. & Kulbacka, J. Lipid composition of the cancer cell membrane. J Bioenerg Biomembr 52, 321-342, doi:10.1007/s10863-020-09846-4 (2020).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, the authors investigated the effect of membrane potential on the internalization of CPPs into the cytosol of some cancer cell lines. Using a CRISPR/Cas9-based screening, they found that some potassium channels play an important role in the internalization of CPPs. The depolarization decreases the rate of internalization of CPPs and the hyperpolarization using valinomycin increases the rate. Using the coarse-grained MD simulations, the authors investigated the interaction of CPPs with a lipid bilayer in the presence of membrane potential. In the interaction of CPPs with the cells, propidium iodide (PI) enters the cytosol significantly. Based on this result, the authors concluded that pores with 2 nm diameter are formed in the plasma membrane.

      One of the defects in this manuscript is the method used to determine the fraction of internalization of CPPs via direct translocation across the plasma membrane. The authors estimated the fraction of direct translocation of CPPs from the fluorescence intensity of the cytosolic region (devoid of endosomes) and the fraction of internalization via endocytosis from the fluorescence intensity of vesicles. However, CPPs can also enter the cytoplasm via endocytosis, and thus the increase in the fluorescence intensity of the cytoplasm is due to two processes (endocytosis and direct translocation). The authors should use inhibitors of clathrin-mediated endocytosis and macropinocytosis to determine accurately the fraction of internalization of CPPs via direct translocation. Low temperature (4 °C) has also been used as an inhibitor of endocytosis (e.g., J. Biophysics, 414729, 2011; J. Biol. Chem., 284, 33957, 2009). Supplementary Figure 1i (the temperature dependence of internalization of TAT-RasGAP317-326) clearly shows that at 4 °C the fraction of internalization was very low, indicating that this peptide enters the cytosol mainly via endocytosis. The determination of the fraction of internalization via endocytosis from the fluorescence intensity of vesicles in this manuscript is not accurate because it is difficult to examine all endosomes in cells and it is not easy to discriminate the fluorescence intensity due to the endosomes from that due to the cytosol.

      It is important to follow a time course of the fluorescence intensity of single cells from the beginning of the interaction of CPPs with the cells (at least from 5 min) in the presence and absence of inhibitors of endocytosis (J. Biol. Chem., 278, 585, 2003) to elucidate the process of the internalization of CPPs into the cytosol.

      Recently, it has been well recognized that membrane potential greatly affects the structure, dynamics and function of plasma membranes (e.g., Science, 349, 873, 2015; PNAS, 107, 12281, 2010). The results on the effect of membrane potential on the internalization of CPPs (depolarization decreases the rate of internalization and hyperpolarization increases the rate), which are the main results of this manuscript, can be interpreted in various ways. For example, the rate of endocytosis may be greatly controlled by membrane potential, which can explain the authors' results.

      The authors used similar concentrations of various CPPs for their experiments (10 to 40 microM), and did not examine the peptide concentration dependence of the internalization. It has been recognized that the CPP concentration affects the mode of internalization of CPPs (e.g., J. Biol. Chem., 284, 33957, 2009). The authors should examine the peptide concentration dependence of the mode of internalization (less than 10 microM, e.g., 1 microM). In the case of depolarization, can higher concentrations of CPPs (e.g., 100 microM) induce their internalization? The effects of membrane potential on plasma membranes and lipid bilayers have been extensively investigated experimentally and thus are well understood, although currently the coarse-grained MD simulations cannot provide quantitative results which can be compared with experimental results. In this manuscript, using the coarse-grained MD simulations, the authors applied 2.2 V to a lipid bilayer to examine the translocation of CPPs. However, it is well known from experimental results that application of such a large voltage to a lipid bilayer induces pore formation in the membrane or its rupture (Bioelectrochem. Bioenerg., 41, 135, 1996; Sci. Rep., 7, 12509, 2017), whereas at low membrane potential (< ~200 mV) a lipid bilayer is stable, although it has transient pre-pores (Biophys. J., 85, 2342, 2003; Biophys. J., 80, 1829, 2001). In the main results obtained in the experiments of cells in this manuscript, the values of the membrane potential are less than -100 mV. Therefore, this description of their results of MD simulations is misleading. According to their theory, only much lower Vm values (-150 mV) induce a large decrease in the free energy barrier of the translocation of CPPs across the lipid bilayer. Is this due to pore formation in the membrane? The description of this point is poor, and thus it is difficult to understand.

      The authors consider that normal membrane potential is much higher than -150 mV, and thus the binding of positively charged CPPs to the membrane surface must increase the negative membrane potential to decrease the free energy barrier. Then, the authors suggest that the presence of R9 in contact with the lipid membrane decreases the transmembrane potential to -150 mV according to the MD calculation. What is the probability of the existence of R9 at the surface of the membrane? R9 cannot bind to electrically neutral lipid bilayers (such as PC) under a physiological ion concentration (Biochemistry, 55, 4154, 2016). Even if, in the case of R9, the membrane potential reaches -150 mV, the other CPPs have lower surface charge density than that of R9, and hence the decrement of membrane potential is lower. The authors should provide the data for other CPPs. It has been reported that the negative membrane potential increases the rate of entry of two kinds of CPPs into the lumen of giant unilamellar vesicles (GUVs) without leakage of water-soluble fluorescent probe (Stokes-Einstein radius; ~0.9 nm diameter), i.e., no pore formation in the GUV membrane (Biophys., 118, 57, 2020, J. Bacteriology, 2021, DOI: 10.1128/JB.00021-21). The authors should discuss the similarity and the difference between the results in these papers and the above results in this manuscript.

      The authors consider that the translocation of CPPs induces depolarization, and as a result, the pore closes immediately. This kind of transient pore cannot explain the authors' result of the significant entry of PI into the cytosol during the interaction of CPPs with the cells. The authors should explain this point.

      In this manuscript, the authors used only cancer cell lines (Raji cell, SKW6.4 cell, and HeLa cell). The lipid compositions and the stability of the plasma membranes of these cells may be different from those of normal cells (e.g., J. Bioenergetics Biomem., 52, 321, 2020; Cancer Res., 51, 3062, 1991). Is there a possibility that negatively charged lipids such as PS and PIP2 are located locally in the outer leaflet in these cells? At least some discussion of this point is essential.

      The authors found that PI enters the cytosol significantly when CPPs interact with these cells. Based on this result, the authors concluded that pores with 2 nm diameter are formed in the plasma membrane. However, they did not show the time courses of entry of PI and that of CPPs, and thus we cannot judge whether the pore formation in the plasma membrane is the cause of the entry of CPPs or the result of the entry of CPPs. We can reasonably consider that CPPs enter the cytosol via endocytosis and bind to the inner leaflet of the plasma membrane, inducing pore formation in the plasma membrane.

      It has been reported that the negative membrane potential increases the rate constant of antimicrobial peptide (AMP)-induced pore formation or local damage in the GUV membrane (J. Biol. Chem., 294, 10449, 2019; BBA-Biomembranes, 1862, 183381, 2020). These results are related to those in the present manuscript, because here the authors consider that CPPs induce pores in the plasma membrane in the presence of negative membrane potential.

      In conclusion, the results of the membrane potential dependence of the rate of the internalization of CPPs may be solid results, which is an important contribution. However, the other analyses and the interpretations are not conclusive at the current stage.

      Significance

      (1) Using a CRISPR/Cas9-based screen, the authors found that some potassium channels play an important role in the internalization of CPP TAT-RasGAP317-326. This result advances the field of CPPs.

      (2) Several studies have suggested that depolarization decreases the rate of internalization of CPPs into the cell cytosol and that hyperpolarization increases the rate. It has also been reported that negative membrane potential increases the rate of entry of two kinds of CPPs into the lumen of GUVs of lipid bilayers. The authors provide new genetic evidence that membrane potential plays an important role in the internalization of CPPs into the cytosol. However, modulation of membrane potential greatly affects the structure, dynamics and function of plasma membranes. At the current stage, it is difficult to judge which process of the internalization of CPPs is affected by the membrane potential.

      (3) Researchers working on CPPs and AMPs will be interested in these results once the authors improve the contents of the manuscript.

      (4) My field of expertise is membrane biophysics, especially the interaction of AMPs and CPPs with GUVs and cells.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The authors propose a mechanism through which voltage-dependent water pore formation is key to the internalization of cell-permeable peptides (CPPs). The claim is based on an in-silico study and on several experimental approaches. The authors compare 5 peptides (R9, TAT-48-57, Penetratin, MAP and Transportan) and use 3 distinct cell lines (Raji, SKW6.4 and HeLa cells), plus neurons in primary cultures. They also present in vivo experiments (mouse skin and zebrafish embryo). All in all, it is an interesting study, but it raises several issues that need to be addressed. Moreover, the length and structure of the manuscript make it very difficult to read (see below under "Reviewer statement").

      Reviewer statement

      The instructions are to use the "Major comments" section to answer 6 precise questions. Unfortunately, this is not possible due to the structure of the document to review. The main manuscript (22 pages) comes with 4 primary figures and 19 supplemental ones. Most of these figures have an enormous number of panels and their legends occupy 17 pages. To this are added 6 supplemental tables and 7 supplemental movies (with 2 pages of legends), 28 pages of Material and Methods, and 146 References (109 for the main manuscript and 37 for Supplemental information). To be frank, I was often tempted to send the manuscript back, asking the authors to submit a document facilitating the task of the reviewers.

      Because of this complexity, my "Major comments" will come after a page by page, paragraph (§) by paragraph and figure by figure "Detailed analysis" of the manuscript.

      Detailed analysis

      Page 4 § 3 The test is based on the ability of TAT-RasGAP to kill the cells. Although controls exist, this is worrying since necrotic death might participate in the rupture of the membrane and artificially amplify internalization after a first physiological entry of the peptide. It is also a bit dangerous to add a FITC group to a short peptide without controlling that it has no effect on the interaction with the membrane (FITC-induced local hydrophobicity can provoke peptide tilting and membrane shearing). In the same vein, the very high peptide concentrations often used in the study (40µM for Raji and SKW6.4 cells and 80µM on HeLa cells) can be highly toxic.

      Page 5 § 1 Supp. Fig.1a shows no differences between the 3 cell types, even though they differ in their modes of peptide internalization, some favoring vesicular staining and others cytoplasmic diffusion. Multiplying cell and peptide types contributes to the complexity of the manuscript without increasing its interest. If there is a conceptual breakthrough, as might be the case, it is obscured by the accumulation of useless images and data. A step toward simplifying the manuscript would be (i) to concentrate on Raji cells (leaving out SKW6.4 and HeLa cells) and (ii) to only discuss the R9, TAT (including TAT-RasGAP) and Penetratin peptides. TAT and R9 are poly-R peptides, which is not the case for Penetratin, which has only 3 Rs. These 3 Rs are important (they cannot be replaced by 3 Ks), but the two Ws, absent in R9 and TAT, are equally important as they cannot be replaced by Fs. This must be considered by the authors when they tend to generalize their model. Supp. Fig1c-d is not necessary (very little information in it) and Supp. Fig 1e is misleading as it takes a lot of imagination to see a difference between homogenous (top) and focal (bottom) diffusion. Supp. Fig.1g: How many cells are we looking at? Given the high variance, the result cannot be interpreted easily. A distribution according to fluorescence bits would be a better way to present the data.

      Supp. Fig2i. This panel confirms that Raji cells differ from the two other cell types by showing clear temperature dependency. The explanation will come later with the energy barrier for low Vm-induced pore formation. This contradicts earlier reports showing that Penetratin translocation is not temperature-dependent, possibly because it was done on neurons that are naturally hyperpolarized, or else because the mechanisms are, at least in part, different from the one proposed here for R9 and TAT. This requires some clarification and supports the suggestion that, instead of multiplying models and peptides, it would be more efficient to compare TAT, R9 and Penetratin internalization by Raji cells and primary neurons. Supp. Fig. 2a-f. Last sentence of the legend "Concentrations above 40µM led to too extensive cell death preventing analysis of peptide internalization". This confirms the warning against the use of concentrations varying between 40µM and 80µM and partially jeopardizes the validity of some experiments.

      Page 6 § 2 The authors advocate 2 modes of entry, opposing transport across the membrane and endocytosis. In contrast with R9, TAT and Penetratin, Transportan or MAP seem to be purely endocytosed but, if they reach the cytoplasm, they still have to cross a membrane (unless "a miracle happens"). For Penetratin and R9/TAT, the authors consider that water pore and inverted micelle formation are incompatible. This is a bit hasty, as inverted micelles might induce water pores through W/lipid interactions requiring fewer R residues and, possibly, less energy. This provides the opportunity to signal that, in spite of their very high number, key references are missing or hidden in cited reviews, some of them written by colleagues who are not among the main contributors to the CPP field.

      Page 7 § 1 Fig. 1b confirms that Raji cells provide a good model for loss and gain of function (lovely rescue experiment) and that the authors should drop the two other cell types that provide no decisive information.

      Page 8 § 1 Supp. Fig. 6b (no serum conditions) allows for the use of "normal" CPP concentrations and suggests that a fraction of the peptides may bind to serum components. There are no arrows in Supp. Fig.6b (but there are in 6c), and the R/pyrene butyrate interaction is not in 6c but in 6a. Still for Supp. Fig. 6c, the death of cells at 20µM (or less), even in the absence of K+ channels, confirms that we are borderline in terms of peptide toxicity. There is a confusion between Supp. Fig. 6d and 6e and a legend problem (6e is not described). Cell death is assessed in % of PI-positive cells. Does this securely distinguish between death and holes allowing for PI entry without death? The CPP is incubated in the presence of pyrene butyrate, making the KO cells less resistant. How does that demonstrate that the potassium channels are not involved in the killing if the peptide is already in? Unless the KO is done after internalization (but the cells should be already dead or dying?). This lacks clarity.

      Page 9 § 1 The conclusion that the diffuse staining does not come from endosomal escape is based on the certainty that LLOME disrupts both endosomes and lysosomes. First, it should be verified with specific markers (rab5, rab7) that the fluorescent vesicles are endosomes. Second, the literature strongly suggests that LLOME primarily disrupts lysosomes and not endosomes. Finally, even if some endosomes are disrupted, the endosomal population is heterogeneous and some CPPs may be in a subpopulation insensitive to LLOME. In addition, the importance of this issue is not well explained. In practice, access to the cytoplasm and nucleus requires crossing the plasma and/or the endosomal membrane and the latter, at least in early endosomes (thus the need to identify the CPP-enriched vesicles), might not be very different from the plasma membrane.

      Page 9 § 2 Is Supp. Fig. 7e really necessary? First, as mentioned several times, if 20µM is a borderline concentration in terms of toxicity, raising the concentration up to 100µM is problematic. Secondly, what matters is not "binding" in general, but binding to the proper membrane components. As mentioned by the authors themselves (Supp. Fig. 1e and movie), there are privileged sites of entry that may correspond to the recognition of specific molecular entities/structures.

      Page 9 § 3 and Page 10 § 1 The authors should have used a construct that does not kill the cells much earlier, just after the screening experiments based on resistance to necrosis induced by TAT-RasGAP. For Supp. Fig 8a and b: I am fully convinced by Raji cells and HeLa cells but not by the SKW6.4 cells.

      Page 10 § 2 Supp. Fig 9 is quite convincing but adds the information that 2µM is sufficient in neurons. This again makes the 20 to 80µM concentrations used on transformed cells unsatisfactory. If one needs a cell line (more user friendly than primary cultures), there are several neural ones that can be differentiated (SHY, LHUMES, etc.) that may have an appropriate membrane potential (below -90mV). Indeed, it would then be important to verify if pore formation is still induced by TAT, R9 and Penetratin (separately) on "naturally" hyperpolarized cells. Figure 2a confirms that changes in Vm are not solid for HeLa and SKW6.4 cells. This casts doubt on the validity of the results obtained with the latter 2 cell lines.

      Page 11 § 2 Why was valinomycin only tried on Raji cells?

      Page 12 § 2 Looking at Fig. 2c, it seems that low Vm increases the uptake of all CPPs, except Transportan. Is there any reason why this Figure does not provide the number of vesicles per cell in the hyperpolarized conditions? In fact, if one goes to Supp. Fig. 9c, it appears that, among all peptides, only Penetratin is almost entirely cytoplasmic after 90' of incubation, whereas MAP and Transportan remain essentially vesicular. TAT and R9 are midway between these two extremes. This again warrants the warning that all CPPs cannot be placed in a single category. The table that describes the sequences strongly suggests that TAT and R9 uptake is due to the numerous Rs that cannot be replaced by Ks. In the case of Penetratin, which only has 3 Rs, the situation is thus different, with the presence of 2 Ws previously shown to be mandatory for internalization, although absent in TAT and R9. In Supp. Fig9, panel g is useless. A difference between peptides is also visible in Figure 2d where depolarization with KCl does not show the same efficiency on all peptides. The issue is whether these differences are significant and, if so, why? This discussion could be restricted to TAT, R9 and Penetratin. Supp. Fig. 10a also suggests that all peptides do not respond similarly to depolarization and that the effects differ between cell types and concentrations used. However, given the high concentrations used and the high variance between replicates, this figure might not be a priority in the reorganization of the manuscript.

      Page 12 § 3 and Page 13 § 1 The pH story is either too long or too short.

      Page 14 § 2 At low Vm values, there is a decrease in free energy barrier. Does this modify temperature-dependency for internalization? Do cells really require energy when the Vm is very low, as is often the case for neurons?

      Page 15 § 2 Figure 2e is not explained, not even in the legend, although the statement that CPPs induce a local hyperpolarization is central to the study.

      Page 16 § 1 It is confusing that the same agent, here PI, is used to measure internalization (2 nm pore formation in response to hyperpolarization) and cell death. I have seen the explanation below, but I do not find it fully satisfactory.

      Page 16 § 2 Entry is not necessarily a size issue. Structure is an important parameter, including possible structure changes, for example in response to Vm modifications. Therefore, the statement that molecules with larger diameters are mostly prevented from internalization is not only vague ("mostly") but incorrect.

      Page 17 § 4 and Page 18 § 1 In Supp. Fig. 13b and c, since the GAP domain is mutated, death is not due to RasGAP activity. So what causes zebrafish death (hyperpolarization?)? The results seem to contradict those of Supp. Fig 13f where survival is 100% at 48 h.

      Page 18 § 2 The formation of inverted micelles is not incompatible with that of pores. CPP-induced hyperpolarization (Vm) is not measured directly, but deduced from experiments involving artificial membranes and in silico modeling. It would be useful to distinguish between what takes place on live cells (in vitro and in vivo) and what is speculated (based on modeling and artificial systems).

      Page 19 § 3 The model posits that the number of Rs influences the ability of the CPPs to hyperpolarize the membrane and, consequently, to induce pore formation. Since pore formation is key to the addressing to the cytoplasm, how can one explain that Penetratin, which has only 3 Rs, is transported to the cytoplasm more readily than TAT or R9? The authors should take this contradiction into consideration and should not leave aside, in the literature, what does not fit with their model. The fact that Rs cannot be replaced by Ks, both in R9 and Penetratin, is explained by differences in deprotonation. This is interesting but speculative. It might be that the interactions of Rs versus Ks with lipids and sugars are different and not only based on charge. After all, their atomic structures, beyond charges, are different.

      Page 20 § 1 We still need to understand endosomal escape.

      Major comments

      • The key conclusions are convincing for a subset of CPPs and cell types
      • Yes, some claims should be qualified as speculative, but not preliminary
      • Many experiments should be removed. Neuronal primary cultures should be introduced to verify the main conclusions, at least for the 3 main CPPs (TAT, R9, Penetratin). Answers must be given to the concentration issue. Vesicles should be characterized as well as the localization of the peptides in or around the vesicles. See above for less decisive but still important experiments that would benefit the study.
      • Yes, the requested experiments correspond to a reasonable cost and amount of time (10 to 20,000 € and 3 to 5 months of work)
      • Yes, the methods are presented in great detail.
      • Yes, the experiments are adequately replicated and statistical analysis is adequate

      Minor comments (not so minor for some of them)

      • See "Detailed analysis"
      • No, prior studies are not referenced appropriately (see above)
      • No, the text and figures are not clear and not accurate (see above)
      • (i) use Raji cells and primary neuronal cultures, plus in vivo model and forget the other cell types; (ii) forget MAP and Transportan and compare TAT/R9 and Penetratin; (iii) drastically reduce the number of figures, tables and movies (6 primary figures, 6 supplemental figures and 4 tables are reasonable numbers; movies are not absolutely necessary); (iv) limit to 6 (max) the number of panels per figure; (v) limit the number of references to fewer than 50 and cite the primary reports rather than reviews; (vi) reduce the size of the Material and Methods and the length of figure legends.

      Significance

      • The mode of CPP internalization is an unanswered question and the report, if revised, will represent a conceptual and technical advance.
        • Bits and pieces of the conclusions can be found in previous reports. But the Vm-dependent pore formation as well as the CPP-induced "megapolarization" (even if only shown for a subset of CPPs) would be an important contribution. The authors must resist the temptation to generalize to all CPPs what might only be true for a few of them.
        • I do not have the expertise for the in-silico work, but my field of expertise allows me to understand all other aspects of the manuscript.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      **Summary:** Previously, the authors showed the importance of contractile force in cell positioning and cell fate specification in preimplantation mouse development. In this study, the authors generated maternal-zygotic mutants of the non-muscle myosin-II heavy chain (NMHC) genes Myh9 and Myh10, and quantitatively analyzed their development using time-lapse microscopy and immunostaining. The authors first examined the expression of NMHCs. Myh9 and Myh10 are present in preimplantation embryos, and Myh9 is maternally inherited. Single maternal-zygotic mutants of Myh9 or Myh10 revealed that maternal Myh9 plays a major role in actomyosin contractility. In maternal Myh9 mutants, compaction and contractility at the 8-cell stage were reduced. Maternal Myh9 mutants demonstrated a longer 8-cell stage, and mutant blastocysts had reduced cell numbers. Cell positioning was not affected; however, cell differentiation was slightly affected by reduced expression of TE and ICM markers. Maternal Myh9 mutants formed blastocoels, but lumen opening was observed earlier than that in wild-type embryos. In double maternal-zygotic mutants of Myh9 and Myh10, cytokinesis was severely affected. Nevertheless, TE fate was specified and embryos formed blastocoels. Interestingly, single-celled mutants swelled upon the formation of fluid-filled vacuoles in their cytoplasm. Similar TE fate specifications and cytoplasmic vacuoles were also observed with single-celled embryos produced by blastomere fusion. Based on these results, the authors concluded that maternal Myh9 is the major NMHC, that Myh10 can nevertheless significantly compensate for the loss of Myh9, and that cell fate specification and morphogenesis are independent of the success of cell division.

      **Minor comments:** Overall, the conclusions of this study are supported by high-quality data. However, I have a few minor concerns:

      We thank the referee for her/his careful analysis of our manuscript.

      1. Line 200~205. The authors showed the correlation between the cell number at the blastocyst stage and the 8-cell stage, and concluded that "the lengthened 8-cell stage of mzMyh9 is an important determinant of their reduced cell number at the blastocyst stage". This conclusion is not well supported for several reasons. First, the timing of cell count is not clear. Cell number was compared at the blastocyst stage, but Figure 1c shows that mzMyh9 embryos initiate blastocoel formation earlier than wild-type embryos. Therefore, if cell count timing was determined based on the blastocyst morphology of the embryos, the timing of cell count (i.e., time after 3rd cleavage) for mzMyh9 mutants is earlier than that observed for wild-type embryos. This shorter culture time likely contributes to the reduced cell number of mzMyh9. Second, the authors only showed a correlation, and no experimental data supporting this conclusion were shown. If the cell number was counted at the same time after the 3rd cleavage, and if the authors' hypothesis is correct, then culturing mzMyh9 mutants for an additional three hours, which is the difference in the duration of the 8-cell stage, should make the cell numbers of mutants comparable to those of wild-type blastocysts.

      Although this correlation provides the best explanation we had based on the data, we agree that the statement above is weakly supported by our study. We do not want to make a strong point about it since we do not think it brings much to the narrative of the study. We have removed the sentence.

      Discussion. In the paragraph starting from line 405, the authors discussed the inconsistencies between the phenotypes observed in mzMyh9 and mzMyh10 mutants and the conclusions of previous studies by others about cell polarization. It would be informative to also discuss the inconsistency with their previous observations on cell fate. In their previous report (reference 8), the authors concluded that without contractile forces, blastomeres adopt an inner-cell-like fate regardless of their position. This is clearly the opposite of the phenotype of mzMyh9;mzMyh10 mutants, in which all the cells are specified to TE. Please add a discussion addressing this discrepancy.

      The data provided here are consistent with the ones from ref 8 (Maître et al, 2016): reduced contractility (Myh9 KO, double Myh9;Myh10 KO or Blebbistatin treatment) leads to reduced CDX2 levels. In ref 8, CDX2 and YAP are checked at the 16-cell stage, before the definitive differentiation into TE and ICM, whereas here we present data at the mid-blastocyst stage (~64 cells). We had not checked SOX2 in ref 8 since it is not expressed at such an early stage, so we cannot conclude about this marker.

      We want to clarify that, as stated in the manuscript, in mzMyh9;mzMyh10 KO we detect CDX2 in 5/7 embryos only and therefore not all cells are correctly specified into TE. However, SOX2 could be detected in the inner cell of the one embryo that produced an inner cell. We had not discussed this issue further since it is difficult to conclude much from such rare events and we would prefer to keep it as such.

      To strengthen our argument about reduced differentiation in NMHC mutant embryos, we now provide YAP immunostaining (Fig S4). YAP is correctly patterned in Myh10 mutants and shows slightly less defined nuclear localization in Myh9 mutants, in agreement with our previous observations on CDX2 in the present study and previous observations on YAP at the 16-cell stage (Maître et al 2016).

      Together, we can conclude that, at the 16-cell stage, when ICM fate is not engaged yet (no detectable SOX2 expression), “inhibition of contractility causes (…) blastomeres to become inner-cell-like with respect to (…) Yap localization and Cdx2 levels, despite their external position” (Maître et al, 2016). At the blastocyst stage embryos with chronically impaired contractility can succeed in some but not all cases to produce TE (this study). Between these two developmental stages, blastomeres are exposed to prolonged signals from the apical domain and can be strongly deformed by the growing lumen. Based on the literature (Hirate et al 2013, Dupont et al 2011), both of these stimuli could potentially favor YAP nuclear localisation despite low contractility.

      Throughout the paper, the description of gene and protein symbols should follow the rules of MGI's guidelines for nomenclature of genes (http://www.informatics.jax.org/mgihome/nomen/gene.shtml#gene_sym). Gene and allele symbols are italicized. Protein symbols use all uppercase letters and are not italicized.

      We have corrected this.

      Line 163. The term "contact angles" is used without any explanation or definition. The term should be introduced with a brief explanation in the text, preferably with a figure. This should help scientists working in different fields to follow the argument.

      We have labelled a contact angle on Fig 1A and specified this in the text and in the figure legend.

      Reviewer #1 (Significance (Required)): The importance of actomyosin contractility in compaction, cell polarization, cell positioning, and cell fate specification in preimplantation embryos has been reported by several groups, mostly using chemical inhibitors, except for the study cited in reference 8, in which chimeras of wild-type and mMyh9 mutant embryos were used. This is the first genetic analysis of the roles of actomyosin contractility in the development of preimplantation embryos. Thus, the major advancement of this study is the genetic dissection of the roles of actomyosin contractility in preimplantation mouse development, and clarifying the contribution of maternal/zygotic Myh9 and Myh10 genes. While the phenotypes of reduced compaction and blastomere contractility are consistent with those observed in previous studies, polarization and TE fate specification of the mutant cells appear inconsistent with the conclusions of previous inhibitor experiments, which show defects in polarization processes and fate specification to ICM. These are potentially important issues, but detailed analyses were not performed. The requirement of actomyosin contractility for the cytokinesis of preimplantation embryos is also a novel finding, although it is expected from studies conducted in other systems. Vacuole formation in single-celled mzMyh9;mzMyh10 mutants in a timely manner suggested that fluid accumulation is a cell autonomous process and that cell differentiation occurs independently of cell division. These are also novel findings, although the latter is somewhat expected from previous studies performed using cell number manipulated embryos. In summary, the conceptual advance offered by this study is small. However, this is a high-quality study and makes critical observations in the field of preimplantation mouse development. 
      Scientists in the field of developmental biology, especially those working on preimplantation development, should be interested in this paper. My field of expertise is preimplantation development.

      We thank the reviewer for her/his appreciation of our work. We want to argue that we did perform a very detailed analysis of the development of the NMHC mutant embryos, with multiple quantitative image and data analyses to thoroughly and objectively characterise the phenotypes of these mutants. If by “detailed analysis”, the reviewer meant a molecular dissection of the phenotype, we argue that 1/ checking the end result (i.e. presence of TE and ICM markers, presence of polarised fluid transport) was sufficient to assess the functionality of biological processes without checking every step of a signalling cascade; 2/ we now provide additional molecular information on the state of YAP and apico-basal polarisation (Fig S3-4).

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): In this manuscript, Schliffka et al. report that maternally deposited Myh9 is the major NMHC in preimplantation embryonic morphogenesis and complete removal of both Myh9 and Myh10 caused severe cytokinesis failure similar to tissue culture cells. Interestingly, although the mutant embryos completely failed cytokinesis thus forming single-celled embryos, they initiated trophoblast gene expression and vacuolization (likely similar to blastocoel formation), suggesting that the timing of preimplantation developmental events is independent from cell number and morphogenetic events.

      We thank the reviewer for her/his appreciation of our work.

      Major comments Vacuolization in single-celled embryos is interesting. In the images, there appear to be two types of vacuoles, F-actin positive and negative. The authors speculate about the similarity to blastocoel formation. To support this, it is important to stain them with some basolateral markers like Na+ ATPase, E-cadherin and B-catenin. It is also important to confirm whether the apical domain is properly formed by staining for apical domain markers like aPKC and Pard6.

      We thank the reviewer for this suggestion. We now provide immunostaining of single Myh9 or Myh10 and double Myh9;Myh10 mutants for aPKC (PRKCz), Na/K ATPase (ATP1A1), Aquaporin-3 (AQP3), the best basolateral marker in our hands, which is also very relevant to fluid pumping, CDH1 and F-actin (Fig S3). We observe that these markers localise similarly in multiple-celled and single-celled embryos, suggesting that vacuoles de facto substitute for the basolateral compartment normally consisting of cell-cell contacts and the lumen. This suggests that the same machinery is at the origin of the fluid inside the lumen and inside vacuoles.

      Minor comments All gene names should be Italicized.

      We have corrected this.

      L157. Myh10 and Myh9 should be mMyh10 and mMyh9.

      We have corrected this.

      L294 1/8 embryos. What does this mean?

      This means this was observed in 1 embryo out of 8 in total.

      L333 6/25 embryos. Does this mean 6 out of 25 embryos combined all maternal double mutants?

      Precisely.

      L438-442. I do not find these embryos similar to tetraploid embryos. I suggest removing these sentences.

      We have removed the sentences.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): This study investigates the roles of non-muscle myosin in development, reporting a requirement for maternal and zygotic Myh9 and Myh10. Strengths of the study include robust genetic techniques, innovative nested imaging to visualize events over different timescales within the same embryos, and analysis of morphological as well as transcriptional/cell fate phenotypes. However, the somewhat superficial phenotype analysis limits the authors' ability to draw strong mechanistic conclusions about what is going on in these mutants. Is cell polarization normal? Is cell signaling (HIPPO signalling) normal?

      We thank the reviewer for carefully assessing our study. We argue that we have thoroughly characterized the phenotypes of the NMHC mutants, which allowed us to draw many important mechanistic conclusions (such as the ability of NMHC mutants to polarise, or to pump fluid in a cell autonomous manner). Each mutant embryo has been imaged at multiple time scales, stained and genotyped. The time-lapses and immunostaining have been extensively quantified using manual as well as automated methods such as particle image velocimetry. We also provided fusion experiments, which phenocopy some aspects of the mutants to provide evidence of the mechanisms causing the observed phenotype.

      Nevertheless, we agree that one can always do more and that we had focused on the biological processes (lineage specification, morphogenesis and cleavages) rather than molecular characterisation. Although polarised fluid pumping ascertains a functioning epithelial polarity, we now provide immunostaining of polarity markers in mutant embryos. Although CDX2 and SOX2 staining inform on the output of the signalling cascade leading to effective TE and ICM differentiation, we now provide YAP immunostaining of mutant embryos. We hope this satisfies the request from the reviewer.

      What determines whether an embryo can form an inside cell or not?

      This is an outstanding question. Cells can internalise by oriented cell division or contractility-mediated cell sorting. Contractility-mediated internalisation functions with only 2 cells (as when doublets of 16-cell stage blastomeres form a cell-in-a-cell structure) but requires the tension asymmetry to grow above a threshold (1.5 in WT and most likely above 3 in these mutants due to their poor compaction, see Maître et al 2016). Oriented cell division only works if there is a cell-cell contact to push dividing cells in between. Therefore, at least 3 cells are required for an inner cell to be internalised by this mechanism.

      In double mutants, the average cell number is 2.9. No embryo consisting of only 2 cells contained an inner cell, about half of embryos with 3-5 cells contained a single inner cell and all embryos with 6 cells or more contained inner cells (Fig 4D). Based on the low contractility of double mutants, we can speculate that they do not succeed in overcoming the tension asymmetry threshold. This would explain why no inner cell is observed in embryos with only 2 cells. We can speculate that with 3-5 cells, oriented divisions could occur thanks to the presence of functional polarity (Korotkevich et al 2017).

      We have added a discussion about this important matter.

      Similarly, the manuscript would benefit from rewriting to reframe the authors' discoveries within the context of what is known regarding lineage specification (e.g., why does CDX2/SOX2 expression indicate normal lineage specification). Additional minor comments are listed below.

      We elaborate on these points.

      **Minor comments:** • Introduction focuses overly on the work of the PI and his mentor, giving the presentation an unnecessarily biased quality.

      We have corrected this to the best of our ability. Please note that, to our knowledge, there are 8 studies (Anani et al., 2014; Maître et al., 2015; Samarage et al., 2015; Maître et al., 2016; Zhu et al., 2017; Zenker et al., 2018; Chan et al., 2019; Dumortier et al., 2019) looking in more or less detail into the contractility of the preimplantation embryo. We mention and cite all of these studies.

      • The text asserts that Myh9 levels are highest during zygote stage, on the basis of qPCR (Fig. S1A), and that this is also observed by RNA-seq (Fig. S1B). However, this conclusion is not supported by the data shown.

      We have corrected this.

      • Would be nice to repeat the qPCR on the mz null.

      We agree with the referee that this would help in assessing the level of compensation between NMHC paralogs in individual mutants. Our qPCR protocol requires a few tens of embryos to be able to amplify the different paralogs. Unfortunately, pooling embryos from our current mating strategy would result in pooling homozygote and heterozygote mutants as we cannot know a priori which embryo is of which genotype.

      We believe that, as nice as this information would be, the current study does not require this information, which would be technically challenging.

      • Were the measurements shown in Fig. S1F taken from the images shown in Fig. S1E? If so, the authors should clarify how the measurements were normalized, since the images in Fig. S1E were clearly taken with different camera settings (as judged by background fluorescence level surrounding the embryos).

      The camera settings were identical but the LUT are set differently (to the maximal signal of a given genotype) so that some signal is visible. The signal intensities are so different between genotypes that if set to a common LUT, we either get the maternal GFP as a saturated white circle or the other genotypes as black images. We explain our LUT settings both in the methods and figure legends.

      As an alternative to the current data presentation, we would be fine to have the same LUT for all images and show almost black images for WT and paternal GFP.
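
      The difference between the two display choices discussed above can be sketched in a few lines. This is only an illustration, assuming a simple linear LUT that maps raw intensities to 8-bit display values; the function name `to_display` and all pixel values are hypothetical and are not taken from the manuscript.

```python
def to_display(pixels, lut_max):
    """Map raw intensities to 0-255 display values, saturating above lut_max."""
    return [min(255, round(255 * p / lut_max)) for p in pixels]

# Hypothetical raw intensities acquired with identical camera settings.
images = {
    "maternal GFP": [50, 2000, 4000],
    "paternal GFP": [50, 120, 200],
}

# Per-genotype LUT: each image is scaled to its own maximal signal.
per_genotype = {name: to_display(px, max(px)) for name, px in images.items()}

# Common LUT: every image is scaled to the brightest signal overall.
common_max = max(max(px) for px in images.values())
common = {name: to_display(px, common_max) for name, px in images.items()}
```

      With a per-genotype LUT the dim image uses the full display range, so its background also appears brighter; with a common LUT the same image stays close to black, which matches the trade-off described in the response.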

      • Can't really conclude that Myh9 is essential for compaction since compaction occurs (albeit abnormally) in the absence of Myh9 (line 177-178).

      Our statement is “we conclude that maternal Myh9 is essential for embryos to compact fully”. WT and mzMyh10 mutants increase their contact angles by 60°, whereas mzMyh9 mutants increase theirs by only 30°. Double mutants compact less than single Myh9 mutants. Therefore, the compaction movement is halved in mzMyh9 and the residual weak compaction could be explained by compensation from Myh10. We stand by our statement.

      • Line 211: "observe" rather than "measure".

      We have corrected this.

      • If the embryos achieve proper ICM/TE ratio, in spite of having half the number of cells in the mutants, is that to be expected? Would/do halved embryos also possess the same ICM/TE ratio? Or is this outcome peculiar to the mutants?

      This is an interesting question on which we had not sufficiently elaborated. Our experiments with cell fusion at the 4-cell stage (Fig. S5) produced embryos with reduced cell number. These resemble Myh9 mutant embryos in that they show a reduced cell number while maintaining the total embryonic cell mass. In both cases, the ICM/total cell ratio is similar to that of control embryos. This indicates that the mechanism setting the ICM/TE ratio is robust to the change in cell number observed in the single mutant. We have added a discussion about this.

      • Line 222: what is the evidence that Cdx2 and Sox2 are TE and ICM markers?

      We have added references to the studies from Strumpf et al., 2005 and Avilion et al., 2003 to support these claims.

      • Is the reported reduction in CDX2 and SOX2 levels due to a stage-delay? What would the comparison look like in wt embryos with half as many cells? Timing of lumen formation may or may not indicate developmental timing...

      We address this point by fusing embryos to half the cell number and find that the fate marker levels are specifically affected as a result of mutation of Myh9 (Fig S5).

      We agree that the timing of lumen formation is unlikely to be a good reference for staging and we did not use this event. We do synchronise embryos based on lumen opening only when comparing lumen growth rate.

      • Line 240 - what was the correction on the multiple pairwise comparisons (multiple t tests)?

      To compare lumen growth rate, individual growth rates of mutants are compared to those of WT using Student’s t test. Growth rates are considered as normally distributed and independent (not pairwise).
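
      For readers wondering what a correction for several mutant-versus-WT comparisons could look like, here is a minimal sketch of the Holm-Bonferroni step-down adjustment. It is not a description of the analysis performed in the study: the response above states only that Student's t test was used. The p-values and their pairing with genotype labels are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return {label: (adjusted_p, reject)} using the Holm step-down procedure."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    results = {}
    running_max = 0.0
    for rank, (label, p) in enumerate(ordered):
        adjusted = min(1.0, (m - rank) * p)
        running_max = max(running_max, adjusted)  # keep adjusted p-values monotonic
        results[label] = (running_max, running_max <= alpha)
    return results

# Hypothetical raw p-values, one per mutant-versus-WT comparison.
raw = {"mzMyh9": 0.004, "mzMyh10": 0.30, "mzMyh9;mzMyh10": 0.02}
adjusted = holm_bonferroni(raw)
```

      The smallest p-value is multiplied by the number of comparisons, the next by one fewer, and so on, which is less conservative than a plain Bonferroni correction while still controlling the family-wise error rate.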

      • Lumen forms on time in mutants, despite having fewer cells. Alternatively, lumen forms early, prior to acquisition of proper cell number. Is there a reason the authors did not consider this alternative?

      The referee is correct. Lumens form with fewer cells in mutant embryos and therefore prior to the acquisition of proper cell number.

      • Lines 306 and 339: why does lack of SOX2 expression suggest that the lineage specification program is intact? Why does expression of CDX2 suggest TE initiation has occurred normally? The regulation of these two markers was not introduced.

      We have better introduced and justified this aspect.

      • Line 349: why is blastocoel formation a cell-autonomous property when it clearly occurs extracellularly? Does this also happen in wild type embryos?

      Blastocoel formation is clearly a multi-cellular process. We argue that fluid accumulation is not. The implications for WT embryos are that fluid can be accumulated in the blastocoel entirely trans-cellularly (no need for fluid to flow through cell-cell junction).

      • Speculate in Discussion on why the ML-7/Blebbistatin experiments results could differ from the genetic results produced here.

      Blebbistatin experiments are in agreement with the mutant data. ML-7 experiments are partially in agreement with the mutant data. The discrepancy lies in the effect on cell polarity. ML-7 affects kinases other than the MLCK, such as PKC, which is a known regulator of cell polarity during preimplantation development. Although this is speculative, we specify this in the revised manuscript.

      • Can these mutant embryos implant?

      We grow colonies of heterozygous mutants, therefore mMyh9, mMyh10 and mMyh9;mMyh10 embryos are viable and must be able to implant. As for homozygous mutants, they are not viable and we do not know whether they can implant.

      Reviewer #3 (Significance (Required)): The study provides the first strong evidence of a requirement for non-muscle myosin in epithelialization. This is significant to embryology and to epithelial biology.

      We thank the reviewer for appreciating the significance of our study. We want to clarify that our study provides evidence for NMHC as NOT being required for de novo epithelialization.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This study investigates the roles of non-muscle myosin in development, reporting a requirement for maternal and zygotic Myh9 and Myh10. Strengths of the study include robust genetic techniques, innovative nested imaging to visualize events over different timescales within the same embryos, and analysis of morphological as well as transcriptional/cell fate phenotypes. However, the somewhat superficial phenotype analysis limits the authors' ability to draw strong mechanistic conclusions about what is going on in these mutants. Is cell polarization normal? Is cell signaling (HIPPO signalling) normal? What determines whether an embryo can form an inside cell or not? Similarly, the manuscript would benefit from rewriting to reframe the authors' discoveries within the context of what is known regarding lineage specification (e.g., why does CDX2/SOX2 expression indicate normal lineage specification). Additional minor comments are listed below.

      Minor comments:

      • Introduction focuses overly on the work of the PI and his mentor, giving the presentation an unnecessarily biased quality.

      • The text asserts that Myh9 levels are highest during zygote stage, on the basis of qPCR (Fig. S1A), and that this is also observed by RNA-seq (Fig. S1B). However, this conclusion is not supported by the data shown.

      • Would be nice to repeat the qPCR on the mz null.

      • Were the measurements shown in Fig. S1F taken from the images shown in Fig. S1E? If so, the authors should clarify how the measurements were normalized, since the images in Fig. S1E were clearly taken with different camera settings (as judged by background fluorescence level surrounding the embryos).

      • Can't really conclude that Myh9 is essential for compaction since compaction occurs (albeit abnormally) in the absence of Myh9 (line 177-178).

      • Line 211: "observe" rather than "measure".

      • If the embryos achieve proper ICM/TE ratio, in spite of having half the number of cells in the mutants, is that to be expected? Would/do halved embryos also possess the same ICM/TE ratio? Or is this outcome peculiar to the mutants?

      • Line 222: what is the evidence that Cdx2 and Sox2 are TE and ICM markers?

      • Is the reported reduction in CDX2 and SOX2 levels due to a stage-delay? What would the comparison look like in wt embryos with half as many cells? Timing of lumen formation may or may not indicate developmental timing...

      • Line 240 - what was the correction on the multiple pairwise comparisons (multiple t tests)?

      • Lumen forms on time in mutants, despite having fewer cells. Alternatively, lumen forms early, prior to acquisition of proper cell number. Is there a reason the authors did not consider this alternative?

      • Lines 306 and 339: why does lack of SOX2 expression suggest that the lineage specification program is intact? Why does expression of CDX2 suggest TE initiation has occurred normally? The regulation of these two markers was not introduced.

      • Line 349: why is blastocoel formation a cell-autonomous property when it clearly occurs extracellularly? Does this also happen in wild type embryos?

      • Speculate in Discussion on why the ML-7/Blebbistatin experiments results could differ from the genetic results produced here.

      • Can these mutant embryos implant?

      Significance

      The study provides the first strong evidence of a requirement for non-muscle myosin in epithelialization. This is significant to embryology and to epithelial biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Schliffka et al. report that maternally deposited Myh9 is the major NMHC in preimplantation embryonic morphogenesis and complete removal of both Myh9 and Myh10 caused severe cytokinesis failure similar to tissue culture cells. Interestingly, although the mutant embryos completely failed cytokinesis thus forming single-celled embryos, they initiated trophoblast gene expression and vacuolization (likely similar to blastocoel formation), suggesting that the timing of preimplantation developmental events is independent from cell number and morphogenetic events.

      Major comments

      Vacuolization in single-celled embryos is interesting. In the images, there appear to be two types of vacuoles, F-actin positive and negative. The authors speculate about the similarity to blastocoel formation. To support this, it is important to stain them with some basolateral markers like Na+ ATPase, E-cadherin and B-catenin. It is also important to confirm whether the apical domain is properly formed by staining for apical domain markers like aPKC and Pard6.

      Minor comments

      All gene names should be Italicized.

      L157. Myh10 and Myh9 should be mMyh10 and mMyh9.

      L294 1/8 embryos. What does this mean?

      L333 6/25 embryos. Does this mean 6 out of 25 embryos combined all maternal double mutants?

      L438-442. I do not find these embryos similar to tetraploid embryos. I suggest removing these sentences.

      Significance

      In this manuscript, Schliffka et al. report that maternally deposited Myh9 is the major NMHC in preimplantation embryonic morphogenesis and complete removal of both Myh9 and Myh10 caused severe cytokinesis failure similar to tissue culture cells. Interestingly, although the mutant embryos completely failed cytokinesis thus forming single-celled embryos, they initiated trophoblast gene expression and vacuolization (likely similar to blastocoel formation), suggesting that the timing of preimplantation developmental events is independent from cell number and morphogenetic events.

      Major comments

      Vacuolization in single-celled embryos is interesting. In the images, there appear to be two types of vacuoles, F-actin positive and negative. The authors speculate about the similarity to blastocoel formation. To support this, it is important to stain them with some basolateral markers like Na+ ATPase, E-cadherin and B-catenin. It is also important to confirm whether the apical domain is properly formed by staining for apical domain markers like aPKC and Pard6.

      Minor comments

      All gene names should be Italicized.

      L157. Myh10 and Myh9 should be mMyh10 and mMyh9.

      L294 1/8 embryos. What does this mean?

      L333 6/25 embryos. Does this mean 6 out of 25 embryos combined all maternal double mutants?

      L438-442. I do not find these embryos similar to tetraploid embryos. I suggest removing these sentences.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Previously, the authors showed the importance of contractile force in cell positioning and cell fate specification in preimplantation mouse development. In this study, the authors generated maternal-zygotic mutants of the non-muscle myosin-II heavy chain (NMHC) genes Myh9 and Myh10, and quantitatively analyzed their development using time-lapse microscopy and immunostaining. The authors first examined the expression of NMHCs. Myh9 and Myh10 are present in preimplantation embryos, and Myh9 is maternally inherited. Single maternal-zygotic mutants of Myh9 or Myh10 revealed that maternal Myh9 plays a major role in actomyosin contractility. In maternal Myh9 mutants, compaction and contractility at the 8-cell stage were reduced. Maternal Myh9 mutants demonstrated a longer 8-cell stage, and mutant blastocysts had reduced cell numbers. Cell positioning was not affected; however, cell differentiation was slightly affected, with reduced expression of TE and ICM markers. Maternal Myh9 mutants formed blastocoels, but lumen opening was observed earlier than that in wild-type embryos. In double maternal-zygotic mutants of Myh9 and Myh10, cytokinesis was severely affected. Nevertheless, TE fate was specified and embryos formed blastocoels. Interestingly, single-celled mutants swelled upon the formation of fluid-filled vacuoles in their cytoplasm. Similar TE fate specifications and cytoplasmic vacuoles were also observed with single-celled embryos produced by blastomere fusion. Based on these results, the authors concluded that maternal Myh9 is the major NMHC, that Myh10 can nevertheless significantly compensate for the loss of Myh9, and that cell fate specification and morphogenesis are independent of the success of cell division.

      Minor comments:

      Overall, the conclusions of this study are supported by high-quality data. However, I have a few minor concerns:

      1. Line 200~205. The authors showed the correlation between the cell number at the blastocyst stage and the 8-cell stage, and concluded that "the lengthened 8-cell stage of mzMyh9 is an important determinant of their reduced cell number at the blastocyst stage". This conclusion is not well supported for several reasons. First, the timing of cell count is not clear. Cell number was compared at the blastocyst stage, but Figure 1c shows that mzMyh9 embryos initiate blastocoel formation earlier than wild-type embryos. Therefore, if cell count timing was determined based on the blastocyst morphology of the embryos, the timing of cell count (i.e., time after 3rd cleavage) for mzMyh9 mutants is earlier than that observed for wild-type embryos. This shorter culture time likely contributes to the reduced cell number of mzMyh9. Second, the authors only showed a correlation, and no experimental data supporting this conclusion were shown. If the cell number was counted at the same time after the 3rd cleavage, and if the authors' hypothesis is correct, then culturing mzMyh9 mutants for an additional three hours, which is the difference in the duration of the 8-cell stage, should make the cell numbers of mutants comparable to those of wild-type blastocysts.
      2. Discussion. In the paragraph starting from line 405, the authors discussed the inconsistencies in the observation of the phenotypes of mzMyh9 and mzMyh10 mutants with the conclusions of previous studies by others about cell polarization. It would be informative to also discuss the inconsistency with their previous observations on cell fate. In their previous report (reference 8), the authors concluded that without contractile forces, blastomeres adopt an inner-cell-like fate regardless of their position. This is clearly the opposite of the phenotype of mzMyh9;mzMyh10 mutants, in which all the cells are specified to TE. Please add a discussion addressing this discrepancy.
      3. Throughout the paper, the description of gene and protein symbols should follow the rules of MGI's guidelines for nomenclature of genes (http://www.informatics.jax.org/mgihome/nomen/gene.shtml#gene_sym). Gene and allele symbols are italicized. Protein symbols use all uppercase letters and are not italicized.
      4. Line 163. The term "contact angles" is used without any explanation or definition. The term should be introduced with a brief explanation in the text, preferably with a figure. This should help scientists working in different fields to follow the argument.

      Significance

      The importance of actomyosin contractility in compaction, cell polarization, cell positioning, and cell fate specification in preimplantation embryos has been reported by several groups, mostly using chemical inhibitors, except for the study cited in reference 8, in which chimeras of wild-type and mMyh9 mutant embryos were used. This is the first genetic analysis of the roles of actomyosin contractility in the development of preimplantation embryos. Thus, the major advancement of this study is the genetic dissection of the roles of actomyosin contractility in preimplantation mouse development, and clarifying the contribution of maternal/zygotic Myh9 and Myh10 genes. While the phenotypes of reduced compaction and blastomere contractility are consistent with those observed in previous studies, polarization and TE fate specification of the mutant cells appear inconsistent with the conclusions of previous inhibitor experiments, which show defects in polarization processes and fate specification to ICM. These are potentially important issues, but detailed analyses were not performed. The requirement of actomyosin contractility for the cytokinesis of preimplantation embryos is also a novel finding, although it is expected from studies conducted in other systems. Vacuole formation in single-celled mzMyh9;mzMyh10 mutants in a timely manner suggested that fluid accumulation is a cell autonomous process and that cell differentiation occurs independently of cell division. These are also novel findings, although the latter is somewhat expected from previous studies performed using cell number manipulated embryos. In summary, the conceptual advance offered by this study is small. However, this is a high-quality study and makes critical observations in the field of preimplantation mouse development. Scientists in the field of developmental biology, especially those working on preimplantation development, should be interested in this paper. 
      My field of expertise is preimplantation development.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank Review Commons and its three reviewers for their supportive and insightful responses to our manuscript. Below, we provide detailed responses to the reviewers’ individual comments and describe how we plan to address them during the revision.

      Reviewer #1: **Major comments:**

      The manuscript is very well written. The data is clearly presented. The methods are explained in sufficient detail with a few exceptions mentioned below, and statistical analyses are adequate. There are some concerns and suggestions about the experimental design and data presentation.

      - Drug treatments. It is not clear whether the cells were previously grown on charcoal-stripped serum before hormone treatments. From methods, it seems they were grown in 5% FBS and directly treated with the hormones. Also, what does "hormone-free medium" mean? Is it charcoal-stripped serum or no serum at all?

      For all experiments, the cells were grown in medium containing 5% FBS. Throughout the manuscript, “hormone-free” refers to medium containing 5% FBS with no dexamethasone added. Technically, this medium is not hormone-free as FBS contains low levels of cortisol. However, the levels of cortisol from the FBS in our medium seem insufficient to elicit a transcriptional response or DNA binding by GR, based on experiments comparing charcoal-stripped medium with medium containing regular 5% FBS. Nevertheless, we acknowledge that it should be made clear to the reader that growth conditions technically were not hormone-free. We will make sure to include this information in both the methods and results sections of a revised manuscript. In addition, we will state explicitly that our naive cells are those that have not been exposed to a high dose of hormone.

      Replicates for these data sets? The ATAC-seq and ChIP-seq should have at least 2. The concordance of the ATAC-seq and ChIP-seq replicates should be described and shown in supplemental figures.

      The ChIP-seq peaks for GR are the intersect of two biological replicates. This is described in the Methods section (page 7). For the ATAC data, we used two biological replicates for the vehicle-treated cells and treated the two hormone treatments (dexamethasone and cortisol) as replicates. In a revised manuscript, we will add a supplemental figure to show the concordance between the replicates.

      Fig1A - The ATAC-seq HM should be clustered to show which peaks in opening/closing and unchanged peaks also have called GR chip peaks. Showing browser shots as in Fig1B is cherry picking data and can be put in a supplementary figure as an example. This is a main point of emphasis of the manuscript so show the data. The atac peaks that do overlap with GR chip peaks should be sorted by GR peak intensity. The QPCR is then only needed to confirm the quantitative changes.

      This is a good idea. As suggested by this reviewer (and also in response to a comment by one of the other reviewers), we will revise this figure panel to make the overlap between GR binding and opening and closing sites more obvious. Here are the numbers:

      A549 cells:

      opening sites: 49%

      closing: 10%

      nonchanging: 18%

      U2OS cells:

      opening: 54%

      closing: 0.2%

      nonchanging: 7%

      Regarding the use of browser shots, these are obviously cherry-picked examples. However, in our opinion they serve a purpose beyond illustrating individual loci that open or close, as they also give the reader an idea of the quality of the ATAC-seq data.

      To show both the ATAC sites and H3K27ac sites are specific to hormone treatment, a random set of 15K peaks not in this peak set also should be shown in HMs and should not change with the treatments. Why does the H3K27ac go down in the 6768 non changing sites with dex?

      The proposed group of control peaks is essentially what we included as “non-changing” peaks. For the revision, we will refine this group and compare the H3K27ac signal between GR-occupied and non-occupied groups. Regarding reduced H3K27ac signal upon Dex treatment at non-changing sites: Notably, this comparison is based on a single ChIP-seq replicate. In our experience, ChIP-seq experiments show considerable variability between biological replicates, which limits our ability to compare signal levels quantitatively. Thus, the difference could simply reflect a difference in ChIP efficiency between the treated and untreated cells. Alternatively, it could be that there is a general redistribution of H3K27ac signal towards GR-occupied opening sites. To pin down which of these explanations is valid, we would need to perform additional experiments, e.g. using spike-ins. However, this is beyond what we can do at the moment and therefore, we will instead revise the text to make clear that our interpretation of these results is somewhat speculative.

      The D & E parts of Fig1 can then be eliminated to become parts of Fig1A. It's not clear in the text that the HMs in Fig1 are all sorted in the same way.

      We will revise figure 1 as requested. In our initial submission, the data was always sorted by signal intensity of the feature shown. We will revise this and sort by ATAC-signal and keep a consistent sorting order for other features shown (and stratify each group into GR-occupied or not).

      - Fig. 1b (and d). The ChIP data is from 3h-hormone treatment while the ATAC-seq data is from a 20h hormone treatment. It seems a bit misleading to directly compare GR occupancy with the state of the chromatin at different time windows. Shouldn't the authors show their ATAC-seq 4h treatment data (shown in Fig S1) here instead?

      We will revise the figures as suggested to show the same time point for ChIP and ATAC-seq data.

      - Fig. 1f. The authors state "downregulated genes only show a modest enrichment of GR peaks". However, there is a significant enrichment of GR-peaks in repressive genes compared to non-regulated genes. It would be interesting to see how some of these peaks look in a browser shot. While the general conclusion "transcriptional repression, in general, does not require nearby GR binding", seems valid, the observation that many GR peaks appear directly bound to nearby repressed genes ought to be more emphatically recognized in the text.

      This is a fair point and was also raised by the other reviewers. During the revision, we will make textual changes to acknowledge that GR binding is enriched near repressed genes, albeit less so than for activated genes. In addition, we will include genome browser shots of genes with nearby peaks that are repressed by GR.

      - Concept of naïve cells (Fig. 3A). If cells are normally grown in serum-containing media, which is known to have some level of steroids, can the cells described here as "Basal expression" be truly free of a primed state? In the first part of the experimental design (+/- 4h hormone), which type of media is present here? Is it 5% FBS? A concern is that the authors may require the assumption that the (4h + 24h) period is sufficient to erase all memory of the cells, which is exactly what they are trying to test.

      See our response to the first major comment above.

      It would be interesting to do a time course of the hormone-free period of the washout to determine the memory of the chromatin environment that results in the enhanced transcriptional response instead of just 24 and 48 hrs in A549 cells.

      We agree that that would be interesting but this is something that we cannot include for now.

      Fig 5A appears to show H3K27ac overlaying H3K27me marks near the promoter of ZBTB16 and at the GR sites within the gene locus with no reduction in H3K27me levels. This seems counterintuitive and should be explained or addressed especially since the authors use quantitative comparisons of H3K27ac levels with and without treatment in other figures.

      A trivial explanation for the overlaying H3K27ac and H3K27me3 marks at the ZBTB16 locus is that the ChIP results represent a population average. From our single-cell FISH experiments, we found that only a subset of cells activates ZBTB16 expression upon hormone treatment. Thus, a potential explanation is that the cells of the population that respond are responsible for the H3K27ac signal whereas the non-responders are decorated with H3K27me3. We will include this information in a revised discussion.

      Showing the changes of ZBTB16 upon 2nd stimulation via FISH is not terribly surprising and is even the most expected reason for higher RNA levels. Why does it only occur at that gene is a better question and is touched on in the discussion. It is more likely that this gene has a very low level of pre-hormone transcription compared to FKBP5 (see Fig 3e and the FISH images). ZBTB16 is in the lower 3rd of basemean RNA levels of GR responsive genes according to the RNAseq data. Selection of 1 or 2 other genes with similar basemean levels of RNA (from the RNA-Seq data) would make the data more

      When compared to FKBP5, ZBTB16 indeed has very low levels of pre-hormone expression. However, this is unlikely to explain the observed “memory” for ZBTB16 given that there are other genes with similarly low pre-hormone levels that do not show more robust responses upon repeated hormone exposure (see Fig. 3B,D). For the FISH experiments, we decided to include a non-primed gene (FKBP5) as a control. We agree that adding additional control genes with comparable basemean levels would be informative. For example, this would tell us if a response of only a subset of cells in the population to hormone is specific to ZBTB16. Based on single cell studies by others (PMID: 32170217), most GR target genes show a response in only a subset of cells, indicating that this is unlikely to be a unique feature of ZBTB16 explaining the priming observed. Rather than performing additional experiments, we will revise the discussion to acknowledge the difference in basemean and the potential role of cell-to-cell variability in explaining the observed “memory” for the ZBTB16 gene.

      **Minor comments:**

      - In the Intro (paragraph two), the authors explain the different mechanisms by which GR might repress genes. One alternative the authors appear to have missed is the possibility of direct binding to GREs while, for example, recruiting a selective corepressor such as GRIP1 (Syed et al., 2020). There are many recent critiques of the notion that transrepression via tethering is responsible for GR repressive actions at all (Escoter-Torres et al., 2020; Hudson et al., 2018; Weikum et al., 2017).

      We are aware of these studies and agree that they should be included when listing the possible mechanisms by which GR can repress genes. We will revise the text accordingly.

      - When the authors introduce the concept of tethering to AP-1, they go way back to the first description of tethering. However, one of the references (Ref 20) actually goes against the tethering model as they did not detect protein-protein interactions between AP-1 and GR, and also, they conclude that repression requires the DNA-binding domain.

      We will pick a more appropriate reference indicative of tethering as a mechanism by which GR might repress genes.

      -Figure 2. The authors state "This suggests that the few sites with persistent opening are likely a simple consequence of an incomplete hormone washout and associated residual GR binding". The authors should check the subcellular distribution of GR after their washout protocol. If the washout is not completed, GR should still be in the nuclear compartment.

      The careful phrasing here was to include the possibility that GR might bind DNA even when hormone is completely washed out. However, a more likely explanation is that the washout is incomplete. The residual GR binding we find in our ChIP assays shows us that a subset of GR is indeed still chromatin-bound which implies that some GR is still in the nuclear compartment.

      - The first part of the manuscript (Repression through "squelching") seems a bit disconnected from the rest of the results (reversibility in accessibility). The abstract is structured in a way that this disconnection seems much less obvious. Perhaps the authors could try to present their squelching part in the middle of the manuscript, following the flow of the abstract? This is just a suggestion.

      When revising the manuscript, we will see if implementing this suggestion is feasible.

      - Figures have CAPS panel letters (A,B,C, etc) while the text calls for lower case letter (a,b,c...)

      We will fix this as part of the revision.

      Reviewer #2: **Major Comments**

      We agree that long-term and repeated GC treatment would be very interesting to study and would yield insights that are more likely to be relevant to, for example, emerging GC-resistance during therapeutic use. We are aware of the limitations of our study and will make sure that these are acknowledged in the revised manuscript and we will point out the speculative nature of translating our findings to an in-vivo setting.

      2a.) The authors show several heatmaps to indicate changes in accessibility, H3K27ac and P300 upon Dex treatment as well as GR binding patterns in Fig. 1 and S1. Those are sorted by decreasing signal strength (I assume). To make those results more comparable, I suggest to sort them all in the same way (e.g. by descending ATAC-Seq signal or fold-change).

      A similar suggestion was made by reviewer 1. We agree that using the same sort order for the datasets makes it easier to link the different types of data we generated. We will present the data with a consistent sorting order and stratified by GR-occupied or not when we revise the manuscript.

      2b.) In line with a.), it is unclear to the reader if those sites opening/closing are the same sites showing increased/decreased H3K27ac or P300 occupancy and if those sites bind GR. Integrating this data together with mRNA, e.g. as correlation plots, would strengthen the authors' argument that accessibility, H3K27ac and mRNA changes are indeed correlated. What about the GR binding sites that do not change accessibility or H3K27ac? What makes those different? Therefore, the statement "Furthermore, closing peaks, which show GC-induced loss of H3K27ac levels and lack GR occupancy (Fig. S1c-f), were enriched near repressed genes" on page 10 as well as the statement "suggesting that transcriptional repression by GR does not require nearby GR binding." in the abstract and discussion cannot be made from how the data is presented.

      The first issue raised will be addressed by using the same sort order across different types of data. It might also shed light on features associated with GR binding sites that do not change accessibility or H3K27ac. Once we implement the revised sorting order, we will evaluate if the statements mentioned are indeed supported by the data.

      2c.) Several recent studies have shown that GR's effect on gene expression and chromatin modification at enhancers might be locus-/context-specific ("tethering", competition, composite DNA binding) and/or depend on recruitment of different co-regulators (see Sacta et al. 2018 (doi: 10.7554/eLife.34864), Gupte et al. 2013 (doi.org/10.1073/pnas.1309898110) and many more). Defining the GR-bound or opening/closing sites in terms of changing H3K27ac (or having H3K27ac or not) more closely would help to link those to gene expression changes e.g. in violin plots. Furthermore, the authors could include a motif analysis to see if the different enhancer behaviours can be explained by differences in the GR motif sequence or co-occurring motifs. This would more closely define the mechanism of chromatin closure at sites that lack GR binding, e.g. by displacement of other transcription factors as described for p65 in macrophages (Oh et al. 2017 (doi.org/10.1016/j.immuni.2017.07.012)). In general a more detailed analysis of the data is required before the authors could state "Instead, our data support a 'squelching model' whereby repression is driven by a redistribution of cofactors away from enhancers near repressed genes that become less accessible upon GC treatment yet lack GR occupancy." on page 10. The results might also be explained by competitive transcription factor binding, tethering or selective co-regulator recruitment (e.g. HDACs).

      We will include a motif analysis comparing opening, closing and non-changing sites (stratified into GR-occupied or not) in a revised version of the manuscript. In addition, we will further investigate the redistribution of p300 upon Dex-treatment e.g. to test the correlation between p300 loss at closing sites lacking GR occupancy and transcriptional repression. We agree that the “squelching model” is just one of several explanations for repression and will provide a more comprehensive list of possible explanations beyond squelching as part of the revision.

      We will discuss the difference in receptor levels between the cell lines, the different number of genomic GR binding sites and its possible implication in the observed residual binding after wash-out in U2OS-GR cells as suggested.

      We agree that the coverage plots do not take the fraction of binding sites with signal into account. However, by also showing the heat maps, this information is also available to the reader. In our opinion, the coverage plots provide a straight-forward way to compare the signal for the different categories of peaks. The violin plots are an interesting alternative way to present the data, which also captures the diversity in the signal within each group. We will add violin plots to the supplementary data as requested.

      We see your point. However, based on the ATAC-signal (Fig. 5D) the changes in nucleosomal occupancy upon GC treatment are the same for naive and primed cells and revert to their base-line level after hormone withdrawal. This indicates that these loci have comparable nucleosome occupancy after wash-out. Yet, the levels for these histone modifications do not differ between primed and naive cells indicating that these histone marks do not “mark” the promoter of primed genes after wash-out.

      We are reluctant to put p-values on every chart, especially for experiments with few replicates. Importantly, we always plot the values for each individual data point, so the reader can gauge if they differ between conditions. We will add p-values for figure 4 to test (support) our claim that ZBTB16 is primed whereas other GR target genes are not.

      A similar suggestion was brought up by reviewer #1, here is the response we gave to this comment: When compared to FKBP5, ZBTB16 indeed has very low levels of pre-hormone expression. However, this is unlikely to explain the observed “memory” for ZBTB16 given that there are other genes with similarly low pre-hormone levels that do not show more robust responses upon repeated hormone exposure (see Fig. 3B,D). For the FISH experiments, we decided to include a non-primed gene (FKBP5) as a control. We agree that adding additional control genes with comparable basemean levels would be informative. For example, this would tell us if a response of only a subset of cells in the population to hormone is specific to ZBTB16. Based on single cell studies by others (PMID: 32170217), most GR target genes show a response in only a subset of cells, indicating that this is unlikely to be a unique feature of ZBTB16 explaining the priming observed. Rather than performing additional experiments, we will revise the discussion to acknowledge the difference in basemean and the potential role of cell-to-cell variability in explaining the observed “memory” for the ZBTB16 gene.

      The fact that we do not observe elevated expression of other genes upon repeated exposure could be due to the relatively short length of the hormone treatment, 4 hours, which was chosen to enrich for direct target genes of GR. These four hours might be insufficient for transcription, translation and ultimately gene regulation by the ZBTB16 protein. We have not looked at ZBTB16 protein levels.

      **Minor Comments**

      We will include this information in a revised version of the manuscript.

      We will add the requested peak-centric view. Based on a previous study (PMID: 29385519), we expect that binding is a poor predictor of gene regulation of nearby genes, especially for repressed genes.

      In our analysis, we looked at opening and closing peaks independently. If a peak is in the vicinity of multiple genes, it will only be assigned to the closest one. Thus, genes that have both an opening and a closing peak in the 50kb window will be included in both the analysis of closing sites and opening sites. We have not looked at clusters of binding sites, but agree that it would be interesting to see if the combinatorial action of multiple peaks makes regulation of the gene more likely. We will look into this during the revision process.
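      The peak-to-gene assignment described in this response can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the toy coordinates, and the choice to measure the 50 kb window from the peak center to the transcription start site are all assumptions made for the example.

```python
from bisect import bisect_left

WINDOW = 50_000  # 50 kb; measuring it from peak center to TSS is an assumption


def assign_peaks_to_closest_gene(peaks, genes, window=WINDOW):
    """Assign each peak to its single closest gene, if within the window.

    peaks: list of (chrom, center) tuples
    genes: list of (name, chrom, tss) tuples
    Returns a dict mapping gene name -> list of assigned peaks.
    """
    # Index TSS positions per chromosome, sorted for binary search.
    by_chrom = {}
    for name, chrom, tss in genes:
        by_chrom.setdefault(chrom, []).append((tss, name))
    for tss_list in by_chrom.values():
        tss_list.sort()

    assigned = {}
    for chrom, center in peaks:
        candidates = by_chrom.get(chrom, [])
        if not candidates:
            continue
        # The closest TSS is either just left or just right of the peak center.
        i = bisect_left(candidates, (center, ""))
        neighbours = candidates[max(i - 1, 0): i + 1]
        tss, name = min(neighbours, key=lambda g: abs(g[0] - center))
        if abs(tss - center) <= window:
            assigned.setdefault(name, []).append((chrom, center))
    return assigned


# Hypothetical toy data (not taken from the manuscript).
genes = [("GENE_A", "chr1", 100_000), ("GENE_B", "chr1", 130_000), ("GENE_C", "chr2", 500_000)]
opening_peaks = [("chr1", 110_000), ("chr2", 700_000)]
closing_peaks = [("chr1", 95_000), ("chr1", 128_000)]

# Opening and closing peaks are analysed independently.
opening = assign_peaks_to_closest_gene(opening_peaks, genes)
closing = assign_peaks_to_closest_gene(closing_peaks, genes)

# Genes with both an opening and a closing peak appear in both analyses.
in_both = set(opening) & set(closing)
```

      In this toy example, GENE_A receives one opening and one closing peak and therefore enters both analyses, while the chr2 peak lies more than 50 kb from any gene and stays unassigned.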

      1. The authors claim on p10 that "We could validate several examples of opening and closing sites and noticed that opening sites are often GR-occupied whereas closing sites are not occupied by GR". As most of the ChIP-Seq experiments were performed on formaldehyde-only fixed cells, the authors might miss "tethered" sites, which are mostly linked to gene repression. You might rephrase this part to most closing sites lack direct DNA binding.

      Even though several studies indicate that tethered binding can be captured using formaldehyde-only fixed cells (e.g. PMID: 32619221, PMID: 15879558), we agree that the ChIP-assay might have blind spots, for instance for tethered binding, and will revise our statements as suggested.

      This might be related to comment #4 given that P300 is brought to the DNA by other transcription factors whereas H3K27ac is directly DNA-bound which likely influences the cross-linking efficiency. By resorting the heat-maps, we will be able to determine the overlap between p300 recruitment and changes in H3K27ac levels (the other main enzyme that deposits this mark is CREBBP (a.k.a. CBP)).

      We will include this information in a revised version of the manuscript.

      We have not looked into this but a previous study by the Reddy lab (PMID: 22801371) has investigated binding sites in A549 cells that are occupied at very low Dex concentrations. They found that this is not driven by a specific GR motif but rather by the presence of binding sites for other transcription factors and chromatin accessibility.

      This data for the GILZ gene is shown in Figure S2C. When we revise the manuscript we will add this information to main figures 1 and 2 as suggested.

      This is shown in figure S3C and shows that expression levels of certain genes (ZBTB16 and FKBP5 but not GILZ) stay high after Dex washout (but not cortisol wash-out) consistent with persistent GR binding at a subset of GR-occupied loci for the experiments using Dex.

      For both S2C and S3C, cells were treated for 4h with Dex before the wash-out. For the ZBTB16 and FKBP5 genes, the persistent GR binding after wash-out is accompanied by a preserved Dex response after wash-out. For GILZ, GR binding at one of the peaks near the GILZ gene is also preserved, yet the expression of this gene reverses to its pre-treatment levels after wash-out. A possible explanation is that the residual binding at the GILZ gene is observed for only one of several nearby GR peaks. Previous studies, where we deleted GR binding sites near the GILZ gene, have shown that the combined action of multiple GR-occupied regions is needed for robust induction of this gene (PMID: 29385519).

      A trivial explanation for the overlaying H3K27ac and H3K27me3 marks at the ZBTB16 locus is that the ChIP data represents a population average. From our single-cell FISH experiments, we found that only a subset of cells activates ZBTB16 expression upon hormone treatment so a potential explanation is that the cells of the population that respond are responsible for the H3K27ac signal whereas the non-responders are decorated with H3K27me3. We will include this information in a revised discussion. On a single histone, H3K27me3 and H3K27ac are mutually exclusive. However, given that a nucleosome has 2 copies of histone H3, both modifications can in principle co-exist.

      We’re guessing here, but we assume the reviewer refers to the potentially slightly higher H3K27me3 levels upon Dex treatment for ChIP-seq whereas the qPCR indicates that the levels do not change? The change seen in the ChIP-seq experiment is marginal and based on a single experiment. In contrast, the qPCR data shows the results from three biological replicates and therefore is probably a more reliable source of information.

      We will include this information in a revised version of the manuscript.

      Cancer cell lines often have variable karyotypes and our FISH data suggests that the ZBTB16 locus is present in more than 2 copies in some of the A549 cells. Here’s the info from the ATCC website describing the karyotype of A549 cells: “… This is a hypotriploid human cell line with the modal chromosome number of 66, occurring in 24% of cells. Cells with 64 (22%), 65, and 67 chromosome counts also occurred at relatively high frequencies; the rate with higher ploidies was low at 0.4% …”.

      Upon quick inspection, we find that GR target genes are typically not marked by H3K27me3, however ZBTB16 does not appear to be the only one. When we revise the manuscript, we will look more systematically at the link between gene regulation by GR and genes marked by H3K27me3 to determine how “special” the presence of this mark is, which will also inform us about the likelihood that it is linked to the transcriptional memory observed for the ZBTB16 gene.

      We are not sure if ZBTB16 regulation by GR is tissue independent. However, in contrast to most GR target genes that are regulated in a cell-type-specific manner, ZBTB16 is regulated in both cell lines we examined and has also been reported to be a GR target gene in other cell types e.g. in macrophages (PMID: 30809020).

      Reviewer #3 **Major Comments:**

      For sure the washout time matters and we do not doubt that the persistent changes observed upon shorter wash-out by the Hager lab are real. One of the reasons we chose the 24h period was to see if the changes observed by Lightman and Hager might persist for extended periods of time as suggested by Zaret and Yamamoto. Our findings suggest that this is not the case and that the majority of GR-induced changes are short-lived. Perhaps future studies can shed light on how long changes persist. However, given the slow dissociation between GR and Dex, we expect that it might be hard to dissect if persistent changes are indeed persisting in the absence of GR binding or reflect an incomplete hormone wash-out.

      The objective of this study was to find out if persistent changes as observed in Ref33 are the exception or the rule, not to test if the original observation is correct (importantly, another cell line was used in Ref33 which makes a 1:1 comparison impossible to begin with). We believe that we have convincingly shown that, for the cell lines we assayed, persistent changes are rare if occurring at all. Given that no convincing persistent changes were observed after a 24h washout, we think that it is very unlikely that such changes would be observable after even longer wash-out periods. We do not intend to include experiments using longer wash-out but will revise the discussion to emphasize that the lack of persistent changes we found might be specific to the cell lines we chose for our studies.

      We agree that adding this percentage is a good idea as this would allow for a more quantitative comparison between the different groups. Here are the numbers:

      A549 cells:

      opening sites: 49%

      closing: 10%

      nonchanging: 18%

      U2OS cells:

      opening: 54%

      closing: 0.2%

      nonchanging: 7%

      We will include this information in a revised version of the manuscript.

      For the ATAC-seq experiments, we treated the dex-treated and cort-treated experiments as replicates to find candidate regions with persistent chromatin changes. For the ATAC-seq data, a site is 'persistent' if called (by MACS2, e.g. DEX vs EtOH) upon treatment and then again 24h after washout (DEX washout vs EtOH washout). For the ATAC-qPCR experiments, we performed 4 biological replicates and will perform a t-test to determine if the small difference we observe at some sites between the EtOH and washout is statistically significant. Given the overlapping error bars and the very small difference, we do not expect the difference to be significant even for these most promising candidates from our genome-wide analysis.
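      The definition of a 'persistent' site given above amounts to an interval overlap between two sets of calls. A minimal sketch follows, assuming the calls from each comparison have already been exported as half-open (chrom, start, end) intervals; the peak calling itself is not reproduced, and the function names and toy coordinates are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def persistent_sites(treatment_calls, washout_calls):
    """Sites called upon treatment that are called again after washout."""
    return [
        site
        for site in treatment_calls
        if any(overlaps(site, w) for w in washout_calls)
    ]


# Hypothetical toy calls (not taken from the manuscript).
treatment_calls = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
washout_calls = [("chr1", 250, 400), ("chr2", 150, 200)]

persistent = persistent_sites(treatment_calls, washout_calls)
```

      Only the first chr1 site is retained here. The chr2 intervals merely touch at position 150 and, with half-open coordinates, do not count as overlapping. For genome-scale call sets a sorted sweep or a dedicated interval tool would be more efficient than this quadratic comparison.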

      Indeed we did not find a mechanistic explanation for the ZBTB16-specific memory. Possible explanations are discussed in the following section of the results (page 14-15): “… Mirroring what we say in terms of chromatin accessibility, transcriptional responses also seem universally reversable with no indication of priming-related changes in the transcriptional response to a repeated exposure to GC for any gene with the exception of ZBTB16. Although several changes in the chromatin state occurred at the ZBTB16 locus, none of these changes persisted after hormone washout arguing against a role in transcriptional memory at this locus (Fig. 5). Similarly, the increased long-range contact frequency between the ZBTB16 promoter region and a GR-occupied enhancer does not persist after washout (Fig. 5e). Notably, our RNA FISH data showed that ZBTB16 is only transcribed in a subset of cells, hence, it is possible that persistent epigenetic changes occurring at the ZBTB16 locus also only occur in a small subset of cells and could thus be masked by bulk methods such as ChIP-seq or ATAC-seq. Another mechanism underlying the priming of the ZBTB16 gene could be a persistent global decompaction of the chromatin as was shown for the FKBP5 locus upon GR activation [35]. Likewise, sustained chromosomal rearrangements, which we may not capture by 4C-seq, could occur at the ZBTB16 locus and affect the transcriptional response to a subsequent GC exposure. Furthermore, prolonged exposure to GCs (several days) can induce stable DNA demethylation as was shown for the tyrosine aminotransferase (Tat) gene [71]. The demethylation persisted for weeks after washout and after the priming, activation of the Tat gene was both faster and more robust when cells were exposed to GCs again [71]. Interestingly, long-term (2 weeks) exposure to GCs in trabecular meshwork cells induces demethylation of the ZBTB16 locus raising the possibility that it may be involved in priming of the ZBTB16 gene [72].
However, it should be noted that our treatment time (4 hours) is much shorter. Finally, enhanced ZBTB16 activation upon a second hormone exposure might be the result of a changed protein composition in the cytoplasm following the first hormone treatment. In this scenario, increased levels of a cofactor produced in response to the first GC treatment would still be present at higher levels and facilitate a more robust activation of ZBTB16 upon a subsequent hormone exposure. Although several studies have reported gene-specific cofactor requirements [73], the fact that we only observe priming for the ZBTB16 gene would make this an extreme case where only a single gene is affected by changes in cofactor levels……”.

      **Minor Comments**

      We will include a motif analysis for opening and closing sites in a revised version of the manuscript.

      We will revise the label in a revised version of the manuscript as suggested.

      We actually prefer the MA plots as they also provide information regarding the basemean counts for regulated genes. This allows one, for example, to see that other GR-regulated genes with similar basemean counts do not show a “memory” suggesting that the low expression level for ZBTB16 likely does not explain the observed priming.

      We will include this information in a revised version of the figure.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this study, Bothe et al investigated the functional role of glucocorticoids on gene regulation and chromatin accessibility using ATAC-seq and RNA-seq data in A549 and U2OS-GR cells under different conditions. They focused on transcriptional memory of GR activation, meaning a more robust transcriptional response upon repeated hormone stimulation. A previous study reported persistent changes after more than 9 days, but here the authors focused on a 24-hour washout period which they reasoned would be more likely to reveal persistent changes. However, they identified only a single gene, ZBTB16, with this characteristic. The studies are well performed, but the reader is left confused as to whether the difference between the present and previous result is a timing issue. That needs to be addressed. In parallel, the authors found that chromatin accessibility was also reversible after hormone withdrawal. This was also true for the ZBTB16 gene and thus could not explain the transcriptional memory for this gene. The authors suggest that priming increases ZBTB16 output by increasing the fraction of cells responding to hormone treatment as well as by augmenting activation by individual cells. This is interesting, but the reason why the ZBTB16 gene is special is not explained. Moreover, since ZBTB16 was the only gene where hormone-induced changes were not reversible, the conclusion that "hormone can induce gene-specific changes in the response to subsequent exposures which may play a role in habituation to stressors and changes in glucocorticoid sensitivity" seems an overstatement especially since the title states that changes are universally reversible. Also the discussion ends by arguing that it is still likely that individual cells remember previous hormone exposure, even though the present paper argues strongly against that except for a single gene where the mechanism of the memory is completely unclear.
This discrepancy between what "might be the case" and what the authors actually observe needs to be corrected.

      Major Comments:

      1. Although the authors reported that only ZBTB16 displayed transcriptional memory, would more genes emerge with less stringent cutoffs, for example Fold Change> 1.5 & adjusted p value < 0.05?
      2. One question the authors should consider is whether the washout time matters. What if it were reduced to a shorter time, for example 8 or 12 hrs? This might especially alter the conclusions about dexamethasone, which Lightman and Hager have suggested to have a long half life of binding to the GR in cells.
      3. The authors point out that Ref. 33 focused on persistent changes after more than 9 days, but the authors state that they focused on a 24-hour washout period which they reasoned would be more likely to reveal persistent changes. However, that was not the case, and the present findings seem to be at odds with the conclusion drawn in Ref. 33. This raises the question of whether the original report was correct and the authors would have seen persistent changes (by whatever mechanism) after 9 days, or whether there are almost no persistent changes at all, as the present study would suggest. To address this and advance the field on this point, it is imperative that the authors do the "positive control" of repeating the protocol used in the original report, to determine if the difference is quantitative (timing) or qualitative (true discrepancy between two groups).
      4. The authors state that "opening sites are often GR-occupied whereas closing sites are not occupied by GR" in Figs 1B and C. What is the fraction of opening sites with GR binding?
      5. In Fig. 2C the authors show SLC9A8 as an example of a gene which maintained a reduced level of open chromatin when assessed by ATAC-seq. To "validate" this they performed ATAC-PCR, and in the results shown in Fig. 2D any differences were not found to be statistically significant. However, these are two different assays, and both have potential flaws and experimental error. Were biological replicates of ATAC-PCR performed, and if so were the differences in ATAC-seq signal between EtOH and washout statistically significant? And is this true of other genes with similar patterns, such as FKBP5, PTK2B, and others?
      6. The authors suggest that priming increases ZBTB16 output by increasing the fraction of cells responding to hormone treatment, but no explanation was offered for why this happens to ZBTB16 but not to all of the other GC-induced genes. This needs to be discussed.
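
      Regarding major comment 1, the effect of relaxing the cutoffs can be checked directly on an existing differential expression table. The following is a minimal sketch, not the authors' pipeline: the gene names and values are hypothetical placeholders, the fold change is assumed to be stored as log2, and changes in both directions are counted.

```python
import math

def passes_cutoff(log2_fc, padj, min_fold=1.5, max_padj=0.05):
    """True if the absolute fold change exceeds min_fold and padj is below max_padj."""
    return abs(log2_fc) > math.log2(min_fold) and padj < max_padj

# Hypothetical rows: (gene, log2 fold change primed vs. naive, adjusted p value)
genes = [
    ("GENE_A", 1.20, 0.001),
    ("GENE_B", 0.70, 0.030),
    ("GENE_C", 0.40, 0.002),
    ("GENE_D", -0.90, 0.200),
]

# Relaxed cutoff suggested in the comment: fold change > 1.5, adjusted p < 0.05
relaxed = [name for name, lfc, padj in genes if passes_cutoff(lfc, padj)]

# A stricter cutoff (fold change > 2), shown only for comparison
strict = [name for name, lfc, padj in genes if passes_cutoff(lfc, padj, min_fold=2.0)]

print("relaxed:", relaxed)
print("strict:", strict)
```

      Comparing the two lists shows which genes would be gained purely by loosening the threshold.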

      Minor Comments

      1. What motifs are enriched at the ATAC sites that open and close?
      2. Fig. 1F would be improved by rephrasing the labels using terms "without site/peak" and "with site/peak". Otherwise, readers may think they are all GR peaks.
      3. For Figs. 3B-3D, volcano plots are a better way to present the differentially expressed genes.
      4. p values should be shown in Figs. 6C, 6D, 6F and 6G.

      Significance

      This is important since glucocorticoids are important hormones and drugs. A limitation of this study is that it's all in cell lines. One concern is that the conclusions differ from those in reference 33 and that will minimize the significance unless this is addressed by the authors, for example by using the 9 day protocol that was used in ref 33 to determine if those results are reproducible.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Bothe and colleagues studied the effect of repeated glucocorticoid exposure on DNA accessibility and gene expression in A549 and U2OS-GR cells and show that most of the glucocorticoid receptor (GR)-induced changes are reversible in both cell lines, after both long-term (20 hrs) and short-term (4 hrs) dexamethasone (Dex) or cortisone treatment. They identified a single gene that seems to have persisting memory of previous Dex exposure, namely ZBTB16.

      Major Comments

      1. The authors used the cancer cell lines A549 and U2OS-GR as model systems; the latter additionally overexpresses GR. To make the work more translatable, an in-vivo model comparing the effect of long-term, short-term and repeated glucocorticoid (GC) treatment on DNA accessibility and gene expression is necessary. The authors should clearly emphasize this limitation of their study in the discussion or add in-vivo data (e.g. qPCRs) to strengthen the translatability.
      2. The authors draw conclusions about the association of DNA accessibility, H3K27ac, P300 and GR occupancy from independent heatmaps. This cannot easily be done from the way the data is currently presented. A direct link between accessibility, H3K27ac and mRNA expression of the associated gene, for example, is missing.
         a.) The authors show several heatmaps to indicate changes in accessibility, H3K27ac and P300 upon Dex treatment as well as GR binding patterns in Fig. 1 and S1. Those are sorted by decreasing signal strength (I assume). To make those results more comparable, I suggest sorting them all in the same way (e.g. by descending ATAC-Seq signal or fold-change).
         b.) In line with a.), it is unclear to the reader whether the opening/closing sites are the same sites that show increased/decreased H3K27ac or P300 occupancy and whether those sites bind GR. Integrating this data together with mRNA, e.g. as correlation plots, would strengthen the authors' argument that accessibility, H3K27ac and mRNA changes are indeed correlated. What about the GR binding sites that do not change accessibility or H3K27ac? What makes those different? Therefore, the statement "Furthermore, closing peaks, which show GC-induced loss of H3K27ac levels and lack GR occupancy (Fig. S1c-f), were enriched near repressed genes" on page 10 as well as the statement "suggesting that transcriptional repression by GR does not require nearby GR binding." in the abstract and discussion cannot be made from how the data is presented.
         c.) Several recent studies have shown that GR's effect on gene expression and chromatin modification at enhancers might be locus-/context-specific ("tethering", competition, composite DNA binding) and/or depend on recruitment of different co-regulators (see Sacta et al. 2018 (doi: 10.7554/eLife.34864), Gupte et al. 2013 (doi.org/10.1073/pnas.1309898110) and many more). Defining the GR-bound or opening/closing sites more closely in terms of changing H3K27ac (or having H3K27ac or not) would help to link those to gene expression changes, e.g. in violin plots. Furthermore, the authors could include a motif analysis to see if the different enhancer behaviours can be explained by differences in the GR motif sequence or co-occurring motifs. This would more closely define the mechanism of chromatin closure at sites that lack GR binding, e.g. by displacement of other transcription factors as described for p65 in macrophages (Oh et al. 2017 (doi.org/10.1016/j.immuni.2017.07.012)).
         In general, a more detailed analysis of the data is required before the authors could state "Instead, our data support a 'squelching model' whereby repression is driven by a redistribution of cofactors away from enhancers near repressed genes that become less accessible upon GC treatment yet lack GR occupancy." on page 10. The results might also be explained by competitive transcription factor binding, tethering or selective co-regulator recruitment (e.g. HDACs).
      3. The authors use U2OS-GRa cells as a second cell line. Those cells overexpress rat GRa (see DOI: 10.1128/mcb.17.6.3181) in a cell line that usually does not express GR. I am wondering to what extent the overexpression reflects residence times and GR binding kinetics of cells endogenously expressing GR (mostly at a lower protein level). At least the number of GR binding sites as well as the number of opening chromatin sites is much higher in U2OS-GR cells than in A549 cells. The authors should discuss this point with respect to the observed preservation of some GR-binding sites in U2OS-GR cells after Dex treatment and washout.
      4. In figure 1 and S1, the authors show coverage plots on top of the heatmaps to show the mean ATAC-Seq, GR, H3K27ac or P300 signal in the different subsets. These plots are statistically inappropriate, as a significant portion of the enhancers does not have a signal and a few enhancers show a very strong signal (at least for H3K27ac, P300 and GR), which skews the mean. Plotting the signal distribution or the distribution of the Dex-dependent change in signal (fold-change, e.g. as violin plots) would more accurately reflect the diversity in the signal response.
      5. ChIP qPCRs against histone marks in figures 5B and S2C are not normalized for histone H3, but the authors clearly see changes in nucleosomal occupancy at those sites by ATAC-Seq. Additional normalization by total H3 is highly recommended.
      6. Figures 1C, 2D, 4A/B, 5B/C/E, 6C/F, S2C/E and S3A-D lack statistics.
      7. In figure 6, the authors compare the ZBTB16 locus with FKBP5, a locus that, by the data presented, is very different from the ZBTB16 locus in terms of expression level (Fig 6C/F) and H3K27me3 occupancy (Fig. 5B). The authors should compare ZBTB16 to a locus with similar expression level and H3K27me3 deposition. In particular, the co-occurrence of H3K27me3 and H3K4me3 (Fig. 5B) at the ZBTB16 promoter indicates a poised chromatin state, whereas the FKBP5 promoter is marked by an active chromatin state.
      8. ZBTB16 itself is a transcriptional regulator, but its elevated expression upon repeated Dex treatment does not affect other genes. How do the authors explain this observation? Is ZBTB16 elevated on the protein level as well?
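
      To illustrate major comment 4, the sketch below shows how a mean profile can misrepresent a set of sites in which most have no signal and a few have a very strong one. The numbers are invented for illustration only and are not taken from the manuscript.

```python
from statistics import mean, median

# Hypothetical per-site signal: 70 sites without signal, 25 weak sites,
# and 5 very strong sites (arbitrary units).
signal = [0.0] * 70 + [1.0] * 25 + [50.0] * 5

mean_signal = mean(signal)      # pulled upwards by the 5 strong sites
median_signal = median(signal)  # reflects the typical (silent) site
fraction_zero = sum(1 for s in signal if s == 0.0) / len(signal)

print(f"mean = {mean_signal}, median = {median_signal}, "
      f"fraction without signal = {fraction_zero}")
```

      In this toy case the mean is well above the signal of 95% of the sites, which is why a distribution plot (e.g. violin plot) is the more informative summary.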

      Minor Comments

      1. The authors nicely explained the analysis of their ATAC-Seq data. I recommend including more information on whether and how the ChIP-Seq data was normalized (library size, scaling factors or spike-ins), even if most of the data sets are published.
      2. In figures 1F and S1F, the authors show the association of opening/closing and non-changing sites and GR peaks with genes that are up/down-regulated or unchanged upon Dex treatment. This gene-centric analysis is skewed by the different sizes of up-/down-regulated gene sets and opening/closing chromatin (especially for the U2OS-GR cells that have 15.6x more opening sites than closing sites). Could the authors also include a peak-centric view showing how many closing/opening and non-changing sites are associated with down/up-regulated or unchanged genes? How good is the association (correlation)?
      3. In figures 1F and S1F it is unclear how the authors handled genes with associated peaks (within +/-50kb) that show different characteristics, e.g. a gene with one peak that gains and another peak that loses accessibility. How do the authors account for >1 opening or closing peaks per gene? In relation to this, do opening/closing sites cluster around up/down-regulated genes? What is the stoichiometry, as 1.6x more closing sites (than opening sites) relate to 1/3 of repressed genes when compared to activated genes?
      4. The authors claim on p10 that "We could validate several examples of opening and closing sites and noticed that opening sites are often GR-occupied whereas closing sites are not occupied by GR". As most of the ChIP-Seq experiments were performed on formaldehyde-only fixed cells, the authors might miss "tethered" sites, which are mostly linked to gene repression. You might rephrase this part to say that most closing sites lack direct DNA binding.
      5. The P300 ChIP-Seq in Fig S1B shows fewer sites with P300 occupancy than sites with H3K27ac. Is this a ChIP quality issue or do other factors mediate changes in H3K27ac? Similar to major comment 1a, are the P300 sites on the top the same sites as the top H3K27ac sites?
      6. Please indicate the primer position of qPCR primers if the genome browser tracks are displayed. That makes the comparison of sequencing and qPCR results easier.
      7. The authors nicely show that GR binding sites with persisting accessibility after Dex treatment and washout in U2OS-GR cells show residual GR binding and are bound by GR at Dex concentrations of 0.1nM. Could the authors specify if differences in the GR motif exist between those and the non-persisting sites?
      8. The authors focus on ZBTB16, FKBP5 and GILZ to show the priming effect of glucocorticoid treatment on ZBTB16 (Fig. 4), but GILZ was not included in the initial ATAC-Seq (Fig. 1) and ATAC-Seq washout (Fig. 2) experiments. For better comparison, I recommend adding qPCR results on GILZ in figures 1 and 2.
      9. The authors indicate that the washout of Dex does restore gene expression in A549 cells to pre-Dex levels (Fig. 4). These cells did not show any persisting GR binding. How does the gene expression in U2OS cells behave, e.g. for the genes displayed in Fig. S2C?
      10. In Fig. S3C, the authors observe that Gilz expression in U2OS-GR cells is similarly induced upon 1st and 2nd stimulation with Dex using 4hrs treatment. How does this relate to the preserved Dex response after 20hrs treatment and washout (Fig. S2C)? Was the expression of GILZ altered after 20hrs (see comment 9)? Are H3K27ac and GR signal after 4hrs Dex stimulation and washout comparable as well? Please comment on the differences observed between the 20hrs and 4hrs experiments.
      11. The GR enhancer of ZBTB16 seems to be simultaneously marked by H3K27ac and H3K27me3 (Fig. 5A). Please comment. Is this an artefact of bulk ChIP-Seq? Is this due to the different timings (H3K27me3 after 1h and H3K27ac after 3hrs)? Can both marks co-exist or do they reflect allelic differences?
      12. Please comment on the observed differences in H3K27me3 response to Dex between ChIP-Seq (Fig. 5A) and the ChIP qPCR (Fig. 5B). Is this a timing issue?
      13. Please indicate the number of replicates for the ChIP-Seq experiments in the figure legends.
      14. The statement "Upon hormone treatment, both the number of transcripts per cell and the number of transcriptional foci increases." on page 13 is confusing. Most cells only have two alleles (max. two transcription foci). Is ZBTB16 duplicated in A549 cells?
      15. ZBTB16 is marked by H3K27me3 (Fig. 5A/B). How many GR binding sites do overlap H3K27me3 in A549 cells? How many genes associated with GR/H3K27me3 sites are expressed in A549 cells? Is ZBTB16 the only one?
      16. Is ZBTB16 a GR target gene that is regulated by GR tissue-independently (like GILZ and FKBP5)?
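
      For minor comments 2 and 3, one way to make the handling of genes with several nearby peaks explicit is sketched below. This is an assumed scheme for illustration, not the authors' method: gene and peak coordinates are toy values, distance is measured from a single TSS position, and a gene with both opening and closing peaks within the window is labelled "mixed".

```python
from collections import defaultdict

WINDOW = 50_000  # +/- 50 kb around the TSS, the window named in the comment

# Hypothetical toy data: gene -> (chromosome, TSS position)
genes = {
    "gene1": ("chr1", 100_000),
    "gene2": ("chr1", 400_000),
    "gene3": ("chr2", 100_000),
}
# Hypothetical peaks: (chromosome, peak centre, accessibility change)
peaks = [
    ("chr1", 90_000, "opening"),
    ("chr1", 130_000, "closing"),
    ("chr1", 420_000, "opening"),
    ("chr1", 380_000, "opening"),
    ("chr2", 300_000, "closing"),
]

def peak_changes_per_gene(genes, peaks, window=WINDOW):
    """Collect the accessibility change of every peak within the window of each gene."""
    per_gene = defaultdict(list)
    for chrom, centre, change in peaks:
        for name, (gene_chrom, tss) in genes.items():
            if chrom == gene_chrom and abs(centre - tss) <= window:
                per_gene[name].append(change)
    return per_gene

def label(changes):
    """Summarise the peaks of one gene as no_peak, opening, closing or mixed."""
    kinds = set(changes)
    if not kinds:
        return "no_peak"
    if len(kinds) > 1:
        return "mixed"
    return kinds.pop()

per_gene = peak_changes_per_gene(genes, peaks)
labels = {name: label(per_gene.get(name, [])) for name in genes}
peak_counts = {name: len(per_gene.get(name, [])) for name in genes}

print(labels)
print(peak_counts)
```

      Reporting the "mixed" category and the per-gene peak counts alongside the gene-centric bars would show how often a single gene is associated with more than one changing site.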

      Significance

      The work is of significant interest as glucocorticoids (GCs) are physiologically secreted with circadian and ultradian rhythms, but widely prescribed with repeated dosing during the day (in order to maintain high GC levels) in patients during chemotherapy (doi: 10.1016/j.critrevonc.2018.04.002, doi: 10.1186/1471-2407-8-84) or anti-inflammatory therapy in rheumatoid arthritis (doi: 10.1186/ar4686, doi: 10.1093/rheumatology/kes086), for example. Therefore, the assessment of long-term versus short-term as well as the effect of repeated GC exposure on various cell types is of high interest to understand adverse effects of GC therapy. However, the use of cancer cell lines as a model system limits the overall translatability of the findings, as does the choice of these particular cell lines. Alveolar epithelial cells (A549) are not classically known as a cell type affected by GC side effects in therapy. Moreover, GR is widely known to regulate tissue-specific gene programs (doi: 10.1038/emboj.2013.106, doi: 10.1016/j.steroids.2016.05.003, doi: 10.1016/j.molcel.2011.06.016). Hepatic cells, skeletal muscle cells or fat cells would reflect the affected tissues more accurately. Obtaining in-vivo data is hampered by the confounding effect of endogenous glucocorticoids and their circadian expression (doi: 10.1016/j.molcel.2019.10.007), but primary cells would overcome those limitations and still be a closer model system than the cancer cell lines.

      That the glucocorticoid receptor mostly binds accessible genomic regions and changes the DNA accessibility of a subset of binding sites after short-term treatment with Dex was previously described (doi: 10.7554/eLife.35073, doi: 10.1093/nar/gkx1044, doi: 10.1038/ng.759), but the reversibility of these effects was not studied before. Therefore, this study adds an interesting conceptual finding.

      The observation that ZBTB16 expression can be boosted by repeated Dex treatment is interesting and seems to be tissue independent. Again, in-vivo or patient data confirming this observation would strengthen the conclusions from this paper and exclude an artefact from immortalized (cancer) cell lines. The impact of this observation depends on ZBTB16 function and if ZBTB16 is elevated on the protein level as well.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Bothe et al investigated whether GR-induced chromatin changes could be somehow preserved after inactivation of the receptor. They performed ATAC-seq to examine the status of chromatin accessibility under several treatment conditions in two different human cell lines. Their main finding is that GR changes to chromatin are universally reversible, with the exception of a single tissue-specific locus (ZBTB16). Additionally, the authors claim their data support a squelching mechanism for transcriptional repression by GR.

      Major comments:

      Are the key conclusions convincing? Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Are the data and the methods presented in such a way that they can be reproduced? Are the experiments adequately replicated and statistical analysis adequate?

      The manuscript is very well written. The data is clearly presented. The methods are explained in sufficient detail with a few exceptions mentioned below, and the statistical analyses are adequate. There are some concerns and suggestions about the experimental design and data presentation.

      • Drug treatments. It is not clear whether the cells were previously grown on charcoal-stripped serum before hormone treatments. From the methods, it seems they were grown in 5% FBS and directly treated with the hormones. Also, what does "hormone-free medium" mean? Is it charcoal-stripped serum or no serum at all? How many replicates were there for these data sets? The ATAC-seq and ChIP-seq should have at least 2. The concordance of the ATAC-seq and ChIP-seq replicates should be described and shown in supplemental figures. Fig1A - The ATAC-seq heatmap (HM) should be clustered to show which of the opening/closing and unchanged peaks also have called GR ChIP peaks. Showing browser shots as in Fig1B is cherry-picking data; these can be put in a supplementary figure as an example. This is a main point of emphasis of the manuscript, so show the data. The ATAC peaks that do overlap with GR ChIP peaks should be sorted by GR peak intensity. The qPCR is then only needed to confirm the quantitative changes.

      To show both the ATAC sites and H3K27ac sites are specific to hormone treatment, a random set of 15K peaks not in this peak set also should be shown in HMs and should not change with the treatments. Why does the H3K27ac go down in the 6768 non changing sites with dex?

      The D & E parts of Fig1 can then be eliminated to become parts of Fig1A. It's not clear in the text that the HMs in Fig1 are all sorted in the same way.

      • Fig. 1b (and d). The ChIP data is from 3h-hormone treatment while the ATAC-seq data is from a 20h hormone treatment. It seems a bit misleading to directly compare GR occupancy with the state of the chromatin at different time windows. Shouldn't the authors show their ATAC-seq 4h treatment data (shown in Fig S1) here instead?
      • Fig. 1f. The authors state "downregulated genes only show a modest enrichment of GR peaks". However, there is a significant enrichment of GR peaks near repressed genes compared to non-regulated genes. It would be interesting to see how some of these peaks look in a browser shot. While the general conclusion "transcriptional repression, in general, does not require nearby GR binding" seems valid, the observation that many GR peaks appear directly bound near repressed genes ought to be more emphatically recognized in the text.
      • Concept of naïve cells (Fig. 3A). If cells are normally grown in serum-containing media, which is known to have some level of steroids, can the cells described here as "Basal expression" be truly free of a primed state? In the first part of the experimental design (+/- 4h hormone), which type of media is present? Is it 5% FBS? A concern is that the authors may require the assumption that the (4h + 24h) period is sufficient to erase all memory of the cells, which is exactly what they are trying to test.

      The transcriptional memory is a second major emphasis of the paper.

      The RNA primers (Table 1) span within an exon or across 2 exons to best measure mRNA levels. The QPCR primers should span exon intron boundaries to better reflect transcriptional activity (prior to mRNA splicing) at the collection time point.

      It would be interesting to do a time course of the hormone-free period of the washout to determine the memory of the chromatin environment that results in the enhanced transcriptional response instead of just 24 and 48 hrs in A549 cells.

      Fig 5A appears to show H3K27ac overlaying H3K27me marks near the promoter of ZBTB16 and at the GR sites within the gene locus with no reduction in H3K27me levels. This seems counterintuitive and should be explained or addressed especially since the authors use quantitative comparisons of H3K27ac levels with and without treatment in other figures.

      Showing the changes of ZBTB16 upon 2nd stimulation via FISH is not terribly surprising and is even the most expected reason for higher RNA levels. Why does it only occur at that gene is a better question and is touched on in the discussion. It is more likely that this gene has a very low level of pre-hormone transcription compared to FKBP5 (see Fig 3e and the FISH images). ZBTB16 is in the lower 3rd of basemean RNA levels of GR responsive genes according to the RNAseq data. Selection of 1 or 2 other genes with similar basemean levels of RNA (from the RNA-Seq data) would make the data more

      Minor comments:

      Specific experimental issues that are easily addressable. Are prior studies referenced appropriately? Are the text and Figures clear and accurate? Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • In the Intro (paragraph two), the authors explain the different mechanisms by which GR might repress genes. One alternative the authors appear to have missed is the possibility of direct binding to GREs while, for example, recruiting a selective corepressor such as GRIP1 (Syed et al., 2020). There are many recent criticisms of the notion that transrepression via tethering is responsible for GR repressive actions at all (Escoter-Torres et al., 2020; Hudson et al., 2018; Weikum et al., 2017).
      • When the authors introduce the concept of tethering to AP-1, they go way back to the first description of tethering. However, one of the references (Ref 20) actually goes against the tethering model, as they did not detect protein-protein interactions between AP-1 and GR and they conclude that repression requires the DNA-binding domain.
      • Figure 2. The authors state "This suggests that the few sites with persistent opening are likely a simple consequence of an incomplete hormone washout and associated residual GR binding". The authors should check the subcellular distribution of GR after their washout protocol. If the washout is not complete, GR should still be in the nuclear compartment.
      • The first part of the manuscript (Repression through "squelching") seems a bit disconnected from the rest of the results (reversibility in accessibility). The abstract is structured in a way that this disconnection seems much less obvious. Perhaps the authors could try to present their squelching part in the middle of the manuscript, following the flow of the abstract? This is just a suggestion.
      • Figures have CAPS panel letters (A,B,C, etc) while the text calls for lower case letter (a,b,c...)

      Escoter-Torres, L., Greulich, F., Quagliarini, F., Wierer, M., and Uhlenhaut, N.H. (2020). Anti-inflammatory functions of the glucocorticoid receptor require DNA binding. Nucleic Acids Res 48, 8393-8407.
      Hudson, W.H., Vera, I.M.S., Nwachukwu, J.C., Weikum, E.R., Herbst, A.G., Yang, Q., Bain, D.L., Nettles, K.W., Kojetin, D.J., and Ortlund, E.A. (2018). Cryptic glucocorticoid receptor-binding sites pervade genomic NF-kappaB response elements. Nat Commun 9, 1337.
      Syed, A.P., Greulich, F., Ansari, S.A., and Uhlenhaut, N.H. (2020). Anti-inflammatory glucocorticoid action: genomic insights and emerging concepts. Curr Opin Pharmacol 53, 35-44.
      Weikum, E.R., de Vera, I.M.S., Nwachukwu, J.C., Hudson, W.H., Nettles, K.W., Kojetin, D.J., and Ortlund, E.A. (2017). Tethering not required: the glucocorticoid receptor binds directly to activator protein-1 recognition motifs to repress inflammatory genes. Nucleic Acids Res 45, 8596-8608.

      Significance

      The study tackles two important questions. One is regarding the effects of inducible transcription factors on chromatin structure after inactivation. The second is on the mechanisms behind transcriptional repression.

      The effect of GR inactivation on chromatin accessibility has already been addressed in previous work for a single locus (Ref 38) or genome wide (Ref 33). However, the 24h temporal window has not been addressed before. In this sense, the manuscript sheds some new light on the matter. Even though the authors conclude that accessibility is globally reversible, they only studied in detail the mechanism behind a single-locus exception.

      Regarding the mechanisms behind transcriptional repression, the authors present data supporting the squelching mechanism, which is still highly controversial.

      The manuscript will be of interest to the molecular and cell biology communities, especially those working on chromatin structure, transcription factors, gene regulation, and nuclear receptors. Overall, this is an interesting paper with somewhat limited novel findings that is suitable for publication after addressing the above comments. The rigor of the findings needs to be better described via replicates and if they have not been done, it should be a major requirement of revision.

      The reviewers specialize in transcription factor dynamics.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank both reviewers for their comments on our manuscript. Our responses to their specific comments and our plan to modify the manuscript are described below.

      Response to reviewer #1

      > Figure 1C is difficult to interpret. Am I supposed to see anything in particular in the two insets? Please provide a descriptive interpretation of the inset and let the reader know if anything in particular is to be noted.

      We agree that it is a bit difficult to interpret although the goal of these images is to show that spermatogenesis appears globally not disturbed until the histone-to-protamine transition in distorter males. We will change the legend of the figure to clarify this particular point.

      > Figure 1D. Elsewhere, in supplemental S1, they have an image for SD5/CyO. This should be provided here as a control. As presented, the genetics don't prove that the interaction between SD5 and Gla is the cause of the phenotype. As presented in the figure, the effect could be caused by SD5 alone, independent of Gla. In S1, this is not the case - SD5 with CyO doesn't produce the phenotype. Likewise, I think they should provide the SD5/CyO image in S1A in Figure 1C.

      We can add the images of the SD5/CyO genotype (currently in FigS1) in Fig1C (whole testis) and in Fig1D (single cyst). We also suggest presenting in this figure the other distorter genotype cn bw/CyO (which is currently in Fig S1). However, because the modified figure would be too big, we also suggest splitting Figure 1 into two figures, with Figure 2 presenting the FISH results including all controls. In this case, we will remove supplemental figure 1.

      > Figure 1E. These images are the formal proof (especially for Gla/SD5 genotype) that the large Rsp array is on the chromosomes that seem destined for removal from the cyst. However, there is no control. The authors should provide FISH results for the genotypes Gla/CyO and, ideally, also cn1 bw1/Cyo.

      We agree with reviewer #1 and will provide images of the Gla/CyO control and cn1 bw1/Cyo in a new Figure 2 as explained above.

      > Figure 2. Keeping consistent with other figures, can the Gla/SD5 panel be in the middle?

      Yes. We swapped the Gla/SD5 and cn bw/SD-Mad panels.

      > Also, shouldn't there be SD5/CyO in Figures 2, 3 and 4, to demonstrate that the phenotypes are the result of the interaction rather than just SD5? I am OK with providing just the cn 1 bw1/SD-Mad here alone, since it is simply contrasted with Gla/SD5.

      We agree that it would be better to also show the SD5/CyO controls. However, we chose to show only one control (Gla/CyO) to make the figures easier to read. We thus suggest providing all images of the SD5/CyO genotype in supplemental figures.

      > Figure 3A. In the scheme, can you provide greater detail as to where F-Actin is expected?

      The scheme was modified to clarify this point.

      > Figure 3B. It is stated that there is a size difference in the nuclei for IC stage and greater variation in ProtB-GFP staining within bundles. Can there be an effort to quantify these observations?

      It would be difficult to quantify ProtB-GFP signal intensity and nuclear size in IC stage cyst because nuclei are very close to each other. The best way would be to squash testes to spread spermatid nuclei but there might be a bias on nuclear size/shape due to the squashing procedure. In addition, on squashed preparations, it is difficult to be sure that the nuclei analyzed and compared belong to the same cyst. We agree that quantification would help to describe the phenotype better but we think that the best read-out of the different SD phenotypes is the quantification of number of abnormally compacted nuclei in seminal vesicles which is provided later in the manuscript.

      > Figure S4. There doesn't appear to be the same phenotype for Gla/SD-Mad (DAPI, ProtB-GFP) in post-IC stage bundles compared to what is seen in 3C for Gla/SD-5. In particular, in figure 3, the defective nuclei seem to be trailing, but in S4, while the bundle appears disorganized, there doesn't appear to be the trailing nuclei. Is this difference real or is it just the result of a single picture contrast? Some clarification could be helpful.

      Actually, the images that were shown in Figure 3C for Gla/SD5 post-IC probably show the SD5 nuclei of one cyst (the normal one) and the Rsp nuclei being eliminated from another cyst (these are trailing nuclei that are too far behind to be included in the same image). For clarity, we therefore replaced the images for Gla/SD5 with an image that resembles the one shown for the Gla/SD-5 genotype.

      We did not mention this observation in the manuscript, but we do see cysts in which abnormally shaped nuclei trail behind the normal nuclei, and sometimes IC cones around the abnormally shaped nuclei seem to be stuck close to the normal nuclei, which are already individualized. It is possible that IC progression around abnormal nuclei is slowed down compared to normal nuclei. The difference may therefore reflect not different phenotypes but, more likely, different states of a dynamic process.


      Response to reviewer #2


      Reviewer #2 had no specific comments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      SD is a multi-component system with two major factors: Sd (a truncation allele of RanGAP that mislocalizes) and Rsp, a satellite DNA (whose copy number determines sensitivity to the distorting RanGAP allele).

      This study by Herbette et al. provides a cytological characterization of Drosophila SD (Segregation Distorter), a male meiotic drive system, focusing on the process of histone-to-protamine transition. By thoroughly studying multiple alleles of SD, they find that the mechanisms by which SD accomplishes segregation distortion are not uniform. In some cases, spermatogenesis is perturbed at the level of protamine incorporation, and in other cases, mature sperm can be generated yet they exhibit distorted segregation.

      In one combination, Gla/SD5, histone elimination is delayed (never complete), whereas cn bw/SD-Mad exhibits normal timing of histone elimination/protamine incorporation, although these two combinations result in a similar, severe degree of distortion. They further show, using a dsDNA antibody, that DNA compaction is incomplete in these SD alleles (again more severe in the Gla/SD5 condition). Interestingly, defective spermatids in the Gla/SD5 combination never progress to sperm maturation or enter the seminal vesicle, whereas defective spermatids in the cn bw/SD-Mad combination are capable of entering the seminal vesicle but likely fail to fertilize or to develop after fertilization, resulting in distortion.

      This is a well-done study and provides important insights into the mechanisms of segregation distortion in the Drosophila melanogaster SD system. The quality of the data is high, and I don't have any major concerns about this manuscript. Although the exact mechanisms of how SDs drive (i.e., why Rsp(S) alleles fail to condense properly, and how this is related to the Rsp copy number) remain unclear, this study provides a significant step forward in tackling this fascinating phenomenon of segregation distortion.

      Significance

      This study provides important insights into the underlying mechanism of segregation distortion during D. melanogaster spermatogenesis. Segregation distortion is a fascinating phenomenon of significant interest in evolutionary biology. Thorough cytological characterization of the spermatogenesis phenotypes that lead to segregation distortion provides much-needed information, and this study is a significant step forward in understanding how meiotic drivers might exploit the system to distort segregation to their advantage.

      Referees cross-commenting

      I think reviewer #1 and I seem to be in good agreement. I don't have anything in particular to add. This is a nice paper.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is a very nice paper that combines cytology and genetics to provide insight into the mechanism of segregation distortion in the Drosophila SD system. The conclusions are well supported by multiple experiments from different angles. Through the use of different genetic backgrounds, their conclusion that Rsp abundance dictates distinct outcomes is well supported. My primary suggestion is that they include a few more controls and provide some additional quantitative analysis. In some cases, quantitative conclusions are made without sufficient support.

      Specific Comments.

      Figure 1C is difficult to interpret. Am I supposed to see anything in particular in the two insets? Please provide a descriptive interpretation of the inset and let the reader know if anything in particular is to be noted.

      Figure 1D. Elsewhere, in supplemental S1, they have an image for SD5/CyO. This should be provided here as a control. As presented, the genetics don't prove that the interaction between SD5 and Gla is the cause of the phenotype. As presented in the figure, the effect could be caused by SD5 alone, independent of Gla. In S1, this is not the case - SD5 with CyO doesn't produce the phenotype. Likewise, I think they should provide the SD5/CyO image in S1A in Figure 1C.

      Figure 1E. These images are the formal proof (especially for the Gla/SD5 genotype) that the large Rsp array is on the chromosomes that seem destined for removal from the cyst. However, there is no control. The authors should provide FISH results for the genotypes Gla/CyO and, ideally, also cn1 bw1/CyO.

      Figure 2. Keeping consistent with other figures, can the Gla/SD5 panel be in the middle? Also, shouldn't there be SD5/CyO in Figures 2, 3 and 4, to demonstrate that the phenotypes are the result of the interaction rather than just SD5? I am OK with providing just the cn 1 bw1/SD-Mad here alone, since it is simply contrasted with Gla/SD5.

      Figure 3A. In the scheme, can you provide greater detail as to where F-Actin is expected?

      Figure 3B. It is stated that there is a size difference in the nuclei for IC stage and greater variation in ProtB-GFP staining within bundles. Can there be an effort to quantify these observations?

      Figure S4. There doesn't appear to be the same phenotype for Gla/SD-Mad (DAPI, ProtB-GFP) in post-IC stage bundles compared to what is seen in 3C for Gla/SD-5. In particular, in figure 3, the defective nuclei seem to be trailing, but in S4, while the bundle appears disorganized, there doesn't appear to be the trailing nuclei. Is this difference real or is it just the result of a single picture contrast? Some clarification could be helpful.

      I think this is a nice paper and I enjoyed reading it very much. The combination of the genetics (different Rsp alleles from nature and the different X chromosomes) with the cytology provides a very reasonable explanation for why different genotypes seem to yield different effects. It provides some reconciliation among previous studies.

      Significance

      I think it is very significant.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful for the careful read and constructive comments provided by the 3 reviewers assigned to our manuscript. Each reviewer provided thoughtful and clearly structured comments that helped us to better clarify points or summarize results in the manuscript that they indicated were not presented clearly or completely. We have revised the manuscript to address the points raised by the reviewers, incorporating edits and additional text throughout the manuscript, figure legends, and supplemental materials. We feel the revised version of the manuscript is much improved as a result of the revisions in response to the reviewers.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors demonstrate a powerful method that applies mNGS to individual mosquitoes using reference-free analysis. This allows researchers to combine the resulting datasets on mosquito identification, blood-meal source, microbiome, viral sequencing, etc. Such knowledge could be a useful tool in detecting and responding to transmission of mosquito-borne diseases that affect human or animal populations, even though the technology is currently likely too expensive for widespread use (as acknowledged by the authors).

      Major Comments:

      No major revisions requested.

      The authors provide their detailed methodology, including code, allowing for replication by other groups.

      Minor Comments:

      The authors' discussion of using this technique in order to detect pathogens should be qualified regarding detection vs possible transmission. Detecting a virus in an engorged mosquito does not necessarily mean that said mosquito can transmit the virus, but may have simply acquired it from a recent blood meal. The same can be said of detecting a plant pathogen following a recent sugar meal.

      From the methods, it seems that mosquitoes were not washed prior to processing. This may make it difficult to discriminate between internal and external microbiota as well as lead to cross-contamination of surface microbiota between mosquitoes collected in the same trap.

      Significance

      This work would currently be of interest to other research groups examining the co-occurrence of pathogens, other microbiota, and blood meals in field-collected mosquitoes. While of great potential application to public health surveillance, the current cost is likely prohibitive.

      My field of expertise is virology and vector biology with minimal background in NGS.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors used unbiased meta-transcriptomic sequencing of 148 diverse wild-caught mosquitoes (Aedes, Culex, and Culiseta species) collected in California, with the main aim of detecting sequences of eukaryotic, prokaryotic and viral origin. Their results show that the majority of the sequence data assembled into contigs corresponding to viral genomes. In their data, 7.4 million viral reads clustered as +ssRNA viruses, including Solemoviridae, Luteoviridae, Tombusviridae, Narnaviridae, Flaviviridae, Virgaviridae, and Filovirida, whereas 2.25 million viral reads were identified as -ssRNA viruses, comprising Peribunyaviridae, Phasmaviridae, Phenuiviridae, Orthomyxoviridae, Chuviridae, Rhabdoviridae, and Xinmoviridae. With 0.94 million viral reads, dsRNA viruses formed the third most abundant virus category, with viruses in the families Chrysoviridae, Totiviridae, Partitiviridae, and Reoviridae. Among the prokaryotic taxa, Wolbachia was the dominant group, followed by other lower-abundance bacterial taxa, including Alphaproteobacteria, Gammaproteobacteria, the Terrabacteria group, and Spirochaetes. Trypanosomatidae was the most dominant eukaryotic taxon, followed by reads from Bilateria and Ecdysozoa. Ultimately, this study demonstrates that single-mosquito meta-transcriptomic analysis has the potential to identify vectors of human health significance, potential emerging pathogens transmitted by them, and their reservoirs, all in one assay.

      Major comments:

      1.Are the key conclusions convincing? The conclusions are accurate.

      2.Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? None. The study's results, discussion and conclusion are appropriate.

      3.Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      As much as the authors describe the use of mNGS as a tool in validating mosquito species and providing an unbiased look at the vector-associated pathogens, it is still prudent for them to use qPCR to validate the obtained RNASeq data (e.g. validation of the viral sequences).

      4.Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. The outlined methodology is realistic.

      5.Are the data and the methods presented in such a way that they can be reproduced? The methodology is reproducible.

      6.Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments:

      1.Specific experimental issues that are easily addressable. qPCR validation of the obtained RNA-seq data should be conducted.

      2.Are prior studies referenced appropriately? Recent publications on the mosquito microbiome/virome should be added (e.g., doi: 10.1128/mSystems.00640-20).

      3.Are the text and figures clear and accurate? The resolution of Fig 4, Fig 6, SFig 2, SFig 4, and SFig 5 is poor. The authors should update them.

      4.Do you have suggestions that would help the authors improve the presentation of their data and conclusions? (1) In the methods section, were the mosquitoes washed before RNA extraction to avoid contamination from the environment? (2) Most of the non-host reads matched viruses (10.5M), whereas only a few belonged to prokaryotes; does this mean that mosquitoes carry more viruses than prokaryotes? (3) None of the mosquito-borne viruses known to occur in California (e.g., WNV, SLEV, WEEV) appears in Table 1, which lists the viruses detected with complete genomes in this study. At the contig level, did the authors detect any mosquito-borne virus known to occur in California? Since mNGS is very sensitive and this study includes a large number of samples, the reason why no known mosquito-borne virus was detected should be discussed.

      Significance

      1.Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. With the existential threat of emerging novel pathogens of global health concern, efficient and rapid public health surveillance strategies are crucial in monitoring and possibly averting such eventual calamities. Specifically, mosquitoes are widely diverse and are known to harbor and transmit various pathogenic agents to humans and animals. Thus, this rapid identification of relevant vector species, pathogens and their reservoirs in one assay is a promising and convenient aspect of surveillance in the public health sector.

      2.Place the work in the context of the existing literature (provide references, where appropriate). Shi et al. reported the first single mosquito viral metagenomics study, in which they demonstrated the feasibility of using single mosquitoes for viral metagenomics, a methodology that has the potential to provide much more precise virome profiles of mosquito populations. In the present study, the authors have gone a step further by aiming to combine three objectives in single mosquito meta-transcriptomics, as described in brief in their abstract and in the comprehensive methodology outline.

      Reference: Shi, C., Beller, L., Deboutte, W. et al. Stable distinct core eukaryotic viromes in different mosquito species from Guadeloupe, using single mosquito viral metagenomics. Microbiome 7, 121 (2019). https://doi.org/10.1186/s40168-019-0734-2

      3.State what audience might be interested in and influenced by the reported findings. The methodology and findings described in this manuscript are important in advancing the public health field of vector surveillance. The identification of relevant vector species, pathogens and their reservoirs in one assay is a promising and convenient aspect of surveillance.

      4.Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am an Associate Professor at a research institute. My lab's research focuses on arbovirology, more specifically vector surveillance of known and novel viruses associated with mosquitoes and ticks, mosquito transcriptomic studies, mosquito virus tropism studies and other related mosquito-virus interaction studies.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This is a very interesting and well-designed study on mNGS of mosquitoes. The authors demonstrate that they can distill valuable information on the vector species, the source of the blood meals and the microbiome/virome using a simple experimental approach and single mosquitoes. A highlight of the work is that the paper is very comprehensive, with an overwhelming dataset and thoughtful analysis. It showcases how sequencing data from a relatively small number of mosquito specimens can be used to conduct sophisticated computational analysis leading to meaningful conclusions. The authors make a strong case for the power of mNGS of mosquitoes, which may be applicable to other (invertebrate) species. In particular, the phylogenetic analysis based on SNP distance without reference genomes and the grouping of contigs by means of co-occurrence in datasets are original. We feel that the work deserves to be published.

      Significance

      We have a number of comments that the authors may consider in further improving the quality of their manuscript:

      What is the impact of this paper?

      I think it is possible that the paper will have a decent impact on the mosquito arbovirus field, because it adequately shows the possibilities that individual mosquito sequencing can bring (e.g. co-occurrence analysis). It may shift the balance to doing more individual mosquito sequencing instead of pools. The paper is also very extensive in the analyses that it does on this very rich data set. Below, some suggestions are given for additional analysis, which should be interpreted as a compliment to the interesting data set acquired. It should however be noted that the ideas and approaches taken are not entirely new. Sequencing individual mosquitoes, co-occurrence analysis and metagenomic sequencing have been done before, although not to this extent and not in this field. Several novel possibilities:

      1. An unbiased way to check if you have the correct mosquito species and the ability to detect subspecies. Using the genetic distance of the transcriptomes, they have likely corrected misidentifications in some samples, where the original calls reflected an understandable mistake. The fact that subspecies overlapped with the sites of capture is very interesting and confirms the relevance of looking at the genetic distance also within species.
      2. Blood-meal analysis from sequence data. Here they can get to species level for 10 out of 40 blood-engorged mosquitoes. The idea is interesting, as you would be able to get a lot more information if you can determine blood-meal origin from RNA-seq data (as shown in this paper). However, I feel that in the current paper (and this may be intentional) they do not properly show that RNA-seq is an adequate alternative to DNA sequencing of the blood. To convince me, I would have liked to have these results compared to DNA sequencing and see how much overlap there is. I understand however that the choice was made not to do this, but I do have a small note for the information given now. It was mentioned that 1 contig with an LCA of vertebrates is enough for a 'blood-meal origin' call. I am however left to wonder how reliable is 1 read? Are there really no contigs with an LCA in vertebrates in the non blood-fed mosquitoes? Also, what do we think happened in the mosquitoes that were visibly bloodfed but nothing was found; any speculation?
      3. The study of co-occurrence, although not novel, is a nice addition to the mosquito virome/microbiome determination field. Identifying novel segments and missed segments of viruses is very nice. I do however wonder: did it ever occur that co-occurrence finds a 'linked' fragment that was clearly wrong? Were some post-analyses done to check if the results make sense? It seems, especially because the paper elaborates on examples, that you need some follow-up. This is not problematic, but a nice addition to the paper would be (as is also described below) to mention which segments were added to viral genomes by co-occurrence and if some checks were done to verify these hits.
      4. Being able to say something about differences in viruses within the same mosquito species is super interesting. Pools do not give the possibility to say something about profiles and prevalence, and the large sample size (148 mosquitoes) makes it possible to find interesting correlations.
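
      The kind of check asked for in point 3 can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes, purely for illustration, that co-occurrence is scored as the Jaccard similarity between the sets of mosquitoes in which two contigs are detected. The contig names, sample IDs and the 0.8 threshold are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets of sample IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cooccurring_pairs(presence, threshold=0.8):
    """Return contig pairs whose sample sets overlap at or above the threshold.

    presence maps a contig name to the set of samples in which it was detected.
    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    pairs = []
    for (c1, s1), (c2, s2) in combinations(sorted(presence.items()), 2):
        score = jaccard(s1, s2)
        if score >= threshold:
            pairs.append((c1, c2, score))
    return pairs

# Hypothetical presence/absence data for three contigs across five mosquitoes.
presence = {
    "contig_A": {"m1", "m2", "m3", "m4"},
    "contig_B": {"m1", "m2", "m3", "m4"},
    "contig_C": {"m2", "m5"},
}

print(cooccurring_pairs(presence))
```

      Under this scoring, a pair supported by only a few samples can reach a high similarity by chance, which is why the follow-up verification of linked segments requested above matters.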

      What parts do you think are problematic?

      1. We question the validity of the 'blood-meal calls', as outlined above.
      2. In this study they use % of non-host reads as a measure for the abundance of a pathogen (see e.g. Figure 3). I don't understand this at all... If you have more pathogens, then the amount of non-host reads would have to go up right? It seems to assume that the amount of non-host reads you have is similar in all samples? It becomes even more problematic when the trend is mentioned that having a higher % of non-host reads for Wolbachia is related to a lower % of non-host reads for viruses. This seems to be trivial as the amount of non-host reads goes up with increased Wolbachia infection, and therefore the % of non-host reads for viruses goes down due to the larger denominator. A different number than 'non-host reads' should be taken to normalise the data and say something about abundance. E.g. host reads or spiked RNA?
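
      The denominator effect described in point 2 can be made concrete with a small sketch. All read counts below are invented for illustration, and normalising to host reads is only one of the alternatives suggested above (a spiked-in RNA would be handled the same way).

```python
def percent_of_non_host(counts):
    """Express each taxon as a percentage of all non-host reads in the sample."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}

def per_million_host(counts, host_reads):
    """Express each taxon as reads per million host reads (an external denominator)."""
    return {taxon: 1e6 * n / host_reads for taxon, n in counts.items()}

# Two hypothetical mosquitoes with the same viral load but different Wolbachia loads.
low_wolbachia = {"virus": 1000, "wolbachia": 1000}
high_wolbachia = {"virus": 1000, "wolbachia": 9000}
host_reads = 1_000_000  # assumed identical in both samples for simplicity

for name, sample in [("low", low_wolbachia), ("high", high_wolbachia)]:
    pct = percent_of_non_host(sample)["virus"]
    rpm = per_million_host(sample, host_reads)["virus"]
    print(f"{name} Wolbachia: virus = {pct:.1f}% of non-host reads, {rpm:.0f} per million host reads")
```

      In this toy case the viral share of non-host reads drops from 50% to 10% although the number of viral reads is unchanged, whereas the host-normalised value stays constant.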

      What are the most relevant questions you are left with?

      1. I am curious about the limited overlap with Sadeghi et al., 2018, who sequenced so many Culex mosquitoes in California. I would suggest saying a little bit more about these discrepancies and their potential causes in the discussion.
      2. What do the authors think are in those 'dark reads'? Is the amount of dark reads the same across the different samples? Similarly, are the 'tetrapoda' reads reduced/absent in mosquitoes with a reference genome available?
      3. In the first part of the results, mention is made of being able to characterize to kingdom level 77% of the 13 million non-host reads (also see comment on non-host reads below). I am however puzzled by the description in the text and supplemental figure 3: which 3 million contigs could not be characterized? Where in supplemental figure 3 are they? This is especially puzzling as the main text mentions that 11 million non-host reads are from complete viral genomes, 0.9 million from eukaryotic taxa and 0.7 million from prokaryotic taxa.
      4. There seem to be 131 bars, corresponding to individual mosquitoes, in figure 3? Where are the remaining 17?

      What are your tips (in addition to responses to above questions)?

      1. I think 'non-host reads' needs to be clearly defined and used consistently across the document. At the end of the paragraph 'Comprehensive and quantitative analysis of non-host sequences detected in single mosquitoes' the concept of "...13 million non-host reads..." is introduced. At first glance at supplemental figure 3 it seems that "non-host reads" could also be defined as the 16.7 million aligned reads that are left after putative host sequences are removed. Although it is true that the derivation of 13 million is explained in the figure text of supplemental figure 3, it may be easier for the reader if this is explained in the main text (it cost me some time). In addition, does the definition of 'non-host reads' (corresponding to 13 million reads) correspond to "classified non-host reads" in the following excerpt: "For every sample, "classified non-host reads" refer to those reads mapping to contigs that pass the above filtering, Hexapoda exclusion, and decontamination steps. "Non-host reads" refers to the classified non-host reads plus the reads passing host filtering which failed to assemble into contigs or assembled into a contig with only two reads."? This caused some confusion.
      2. I believe it would be a valuable addition to add a table for the viruses which includes: 1) How it was determined that the complete genome is there, 2) The percentage overlap for those segments that were identified with blast and 3) Which viruses were already known.
      3. Please state the numbers of mosquitoes caught somewhere in the materials and methods.
      4. Pg2 L1-3: "Metagenomic sequencing..... a single assay." Perhaps a bit early for this statement. Would suggest to place it two paragraphs later before:"Here, we analyzed...."
      5. Figure S4 is too pixelated to read. Perhaps due to pdf conversion, but please do check before submission.
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Many flatworm species reproduce asexually, by fission, and the process relies on the activity of stem cells (neoblasts), which drive regeneration. The question that this work tries to address is what the dynamics of stem cells in this process are, including how many stem cells contribute to regeneration and what the mutation rates and selection mechanisms, if any, are. To this end, the authors tracked one specimen of the planarian Girardia tigrina for more than ten rounds of fission, re-sequenced its genome at multiple time points, and applied methods of population genetics to analyze and model the data. The main conclusion of the work is that there is a high somatic mutation rate, rapid loss of heterozygosity, and a small stem cell population that contributes to regeneration after fission.

      Reviewer #1 (Significance (Required)):

      The work has value, since it provides a framework to address the evolutionary aspects of stem cell dynamics in flatworms. However, as the authors point out multiple times, there are many unknown biological parameters, such as the ratio of cell to organism regeneration (g), and simplifications, which can significantly influence the results. For this reason, the authors provide a range of estimates for somatic mutation rates and the effective stem cell population size, rather than final conclusions. As the authors point out, further work will be needed to refine the model, but generating new data for that is beyond the scope of this manuscript. As such, I find that this manuscript is an important initial contribution to the field of stem cell population dynamics in flatworms, and its methods, results and conclusions are convincing. I don't have further suggestions for improving this manuscript.

      Thank you very much for your positive assessment of our work.

      **Referees cross-commenting**

      I agree with the suggestion of Reviewer #2 that repeating the analysis on additional contigs, instead of focusing only on the longest contig in the assembly, will be useful.

      We will process and analyze a few additional contigs to evaluate genomic variation in transmission of somatic variants in this system.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      **Summary:**

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The authors aimed to use population genetics to determine the number of stem cells active in each cycle of regeneration and the equality of their relative contributions in planarians. They approached this by establishing a population by serial fission from one wild isolate of Girardia cf. tigrina collected in Italy. They used next-generation sequencing to sample variants of regenerated worms at different generations of fissioning. They estimated the effective population size of stem cells to be a few hundred, and also calculated nucleotide diversity and the somatic mutation rate. They propose that a small effective number of propagating stem cells might contribute to reducing reproductive conflicts in clonal organisms.

      **Major comments:**

      • Are the key conclusions convincing?

      The mutation rate is reasonable. The effective stem cell population size and the genetic diversity may vary between different species. A small effective stem cell population size is not counterintuitive.

      Generally, the work is interesting and deserves to be published.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The current analysis is based on many assumptions, one single set of experiments and a genome that is not well assembled. The authors have been careful with their language and documented the limitations in discussion.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I would feel more comfortable if the authors could repeat the analysis with two more random long contigs to get a better idea of whether the localization of markers affects the conclusion. The concern is whether different parts of the genome behave differently and whether the Girardia genome is highly repetitive. As the analysis pipeline is established, I expect this can be completed in a month with no experimental cost.

      We will process and analyze a few additional contigs to evaluate genomic variation in transmission of somatic variants in this system.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      **Minor comments:**

      • Specific experimental issues that are easily addressable.

      Yes.

      • Are prior studies referenced appropriately?

      Yes.

      • Are the text and figures clear and accurate?

      Yes.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Yes. Please also see the significance section.

      Specifically, my concerns are about writing and the context of current study.

      "Small effective number of propagating stem cells might contribute to reducing reproductive conflicts in clonal organisms." is confusing. It is not a good closing sentence for the abstract of the work presented. "Reproductive conflicts" needs clarification, and how the work relates to that concept needs support. I recommend keeping the interpretations simple and focused on the data.

      Please see response under “Significance”.

      Reviewer #2 (Significance (Required)):

      Both questions the authors attempt to address, the genetic diversity of clonal animals and the number of stem cells contributing to regeneration, are interesting and important. The combination of these two is a bit odd in the manuscript. In other words, the population genetics approach did not address the cell biology question: how many or what proportion of stem cells are active in each cycle of regeneration. I would recommend that the authors focus the writing on one question only: the genetic diversity and evolution of a clonal species, which is driven by stem cell genome evolution and the process of regeneration. The cell biology question, as phrased by the authors in the abstract and introduction, needs to be resolved by cell biologists. I understand the appeal of putting the current study in the context of regeneration research. A balance should be achieved. Currently, the second sentence of the abstract and the first paragraph of the introduction are odd and misleading. The first paragraph of the introduction could become a second paragraph that introduces the planarian system for the study.

      We will restructure the manuscript to clearly separate the findings that arose directly from experimental (sequencing) data i.e. magnitude and inheritance pattern of somatic variation, and the findings that were inferred from our approximate population genetic model and depend on the unknown parameter g i.e. the effective number of stem cells and the somatic mutation rate. We will emphasize the distinction. The statements that are tangentially relevant and are not directly supported by our analyses will be modified or removed.

      In the context of genetic diversity of clonal species, many studies should be referenced. It is interesting as well to draw comparisons with other species. Asexual planarians are unique and interesting in that space.

      That said, the attempt to examine stem cell population genetics is especially interesting and important as the fissiparous planarians do not undergo bottleneck selection by zygotes. In the context of recent progress studying planarian genetic diversity (Nishimura, O. et al. 2015, Guo, L. et al. 2016), Asgharian H. et al.'s work is timely and an important contribution to planarian researchers and evolutionary biologists. The question has general interest to cancer biologists as well. The manuscript does not have the data and is not written in a way to reach such broader audiences yet. A community is growing to address these questions.

      We agree with the reviewer’s point about the pioneering works of Nishimura et al. 2015 and Guo et al. 2016. Both papers were indeed cited in our manuscript. We will cite more studies pertaining to the question of somatic genetic diversity in planarians.

      The study of planarian genetic diversity has just started with two publications (Nishimura, O. et al. 2015, Guo, L. et al. 2016). It is reasonable to have lots of limitations and assumptions in the manuscript. The work is an interesting piece to be published, assuming the major points listed in the review are addressed. The reported findings will be part of the early literature and inspiration for planarian researchers and evolutionary biologists. I expect many more future manuscripts will be published, either to reexamine the reported findings or to push our understanding of the question deeper.

      Thank you very much for this assessment. We fully agree.

      My expertise is with planarian biology, genome, genetics, and diversity. I do not have sufficient expertise to evaluate the equations used in the study.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The authors aimed to use population genetics to determine the number of stem cells active in each cycle of regeneration or the equality of their relative contributions in planarians. They approached this by establishing a population with serial fission from one wild isolate of Girardia cf. tigrina collected in Italy. They used next generation sequencing to sample variants of regenerated worms at different generations of fissioning. They estimated the effective population size of stem cells to be a few hundred, in addition to calculating nucleotide diversity and the somatic mutation rate. They propose that a small effective number of propagating stem cells might contribute to reducing reproductive conflicts in clonal organisms.

      Major comments:

      • Are the key conclusions convincing?

      The mutation rate is reasonable. The effective stem cell population size and the genetic diversity may vary between different species. A small effective stem cell population size is not counterintuitive.

      Generally, the work is interesting and deserves to be published.

      • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      The current analysis is based on many assumptions, one single set of experiments and a genome that is not well assembled. The authors have been careful with their language and documented the limitations in discussion.

      • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      I would feel more comfortable if the authors could repeat the analysis with two more random long contigs to get a better idea of whether the localization of markers impacts the conclusion. The concern is whether different parts of the genome behave differently and whether the Girardia genome is highly repetitive. As the analysis pipeline is established, I expect this can be completed in a month with no experimental cost.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes

      • Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      Minor comments:

      • Specific experimental issues that are easily addressable.

      Yes.

      • Are prior studies referenced appropriately?

      Yes.

      • Are the text and figures clear and accurate?

      Yes.

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Yes. Please also see the significance section. Specifically, my concerns are about writing and the context of current study.

      "Small effective number of propagating stem cells might contribute to reducing reproductive conflicts in clonal organisms." is confusing. Not a good abstract ending sentence for the work presented. Reproductive conflicts need clarification. How the work relates to that concept needs support. I recommend keeping the interpretations simple and focused on the data.

      Significance

      Both questions the authors attempt to address, the genetic diversity of clonal animals and the number of stem cells contributing to regeneration, are interesting and important. The combination of these two is a bit odd in the manuscript. In other words, the population genetics approach did not address the cell biology question: how many or what proportion of stem cells are active in each cycle of regeneration. I would recommend that the authors focus the writing on one question only: the genetic diversity and evolution of a clonal species, which is driven by stem cell genome evolution and the process of regeneration. The cell biology question, phrased by the authors in the abstract and introduction, needs to be resolved by cell biologists. I understand the appeal to put the current study in the context of regeneration research. A balance should be achieved. Currently, the second sentence of the abstract and the first paragraph of introduction are odd and misleading. The first paragraph of the introduction can be a second paragraph to introduce the planarian system for the study.

      In the context of genetic diversity of clonal species, many studies should be referenced. It is interesting as well to draw comparisons with other species. Asexual planarians are unique and interesting in that space.

      That said, the attempt to examine stem cell population genetics is especially interesting and important as the fissiparous planarians do not undergo bottleneck selection by zygotes. In the context of recent progress studying planarian genetic diversity (Nishimura, O. et al. 2015, Guo, L. et al. 2016), Asgharian H. et al.'s work is timely and an important contribution to planarian researchers and evolutionary biologists. The question has general interest to cancer biologists as well. The manuscript does not have the data and is not written in a way to reach such broader audiences yet. A community is growing to address these questions.

      The study of planarian genetic diversity has just started with two publications (Nishimura, O. et al. 2015, Guo, L. et al. 2016). It is reasonable to have lots of limitations and assumptions in the manuscript. The work is an interesting piece to be published, assuming the major points listed in the review are addressed. The reported findings will be part of the early literature and inspiration for planarian researchers and evolutionary biologists. I expect many more future manuscripts will be published, either to reexamine the reported findings or to push our understanding of the question deeper.

      My expertise is with planarian biology, genome, genetics, and diversity. I do not have sufficient expertise to evaluate the equations used in the study.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Many flatworm species reproduce asexually, by fission, and the process relies on the activity of stem cells (neoblasts), which drive regeneration. The question that this work tries to address is what the dynamics of stem cells are in this process, including how many stem cells contribute to regeneration and what the mutation rates and selection mechanisms, if any, are. Towards this, the authors tracked one specimen of the planarian Girardia tigrina for more than ten rounds of fission, re-sequenced its genome at multiple time points, and applied methods of population genetics to analyze and model the data. The main conclusion of the work is that there is a high somatic mutation rate, rapid loss of heterozygosity, and a small size of the stem cell population that contributes to regeneration after fission.

      Significance

      The work has value, since it provides a framework to address the evolutionary aspects of stem cell dynamics in flatworms. However, as the authors point out multiple times, there are many unknown biological parameters, such as the ratio of cell to organism regeneration (g), and simplifications, which can significantly influence the results. For this reason, the authors provide a range of estimates for somatic mutation rates and the effective stem cell population size, rather than final conclusions. As the authors point out, further work will be needed to refine the model, but generating new data for that is beyond the scope of this manuscript. As such, I find this manuscript to be an important initial contribution to the field of stem cell population dynamics in flatworms, and its methods, results and conclusions convincing. I don't have further suggestions for improving this manuscript.

      Referees cross-commenting

      I agree with the suggestion of Reviewer #2 that repeating the analysis on additional contigs, instead of focusing only on the single longest contig in the assembly, will be useful.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We want to thank the reviewers for their careful evaluation of our work and their helpful suggestions. At the end of this letter we provide a point-by-point response describing how we aim to address their concerns, which can be summarised in the following main points:

      1-We will provide further evidence for the efficiency and dynamics of beta-catenin deletion in adult neural stem cells in vivo (point raised by both reviewers).

      We fully agree that although we tested for the disappearance of beta-catenin transcripts in sorted NSCs after deletion, providing further proof of the absence of beta-catenin protein in these cells will help strengthen our conclusions. For this, we are performing additional stainings for beta-catenin and Wnt/beta-catenin targets, together with neural stem cell markers, to quantify the loss of beta-catenin and Wnt/beta-catenin signalling in NSCs at P90 (30 days after deletion), as well as new P150 samples (90 days after deletion).

      2-We will investigate in further detail the effects (or lack of effect) of beta-catenin deletion on adult neurogenesis.

      The focus of our work is the effect of Wnt/beta-catenin signalling on NSCs. Nevertheless, we agree with reviewer 2 that extending our analysis to later stages in the neurogenic process will be of importance to better contrast our results with previous reports identifying a role for Wnt in neuronal production in the adult hippocampus. We are currently processing new material from mice in which beta-catenin was deleted at P60 and brains collected after 3 months to evaluate the long-term effects of beta-catenin deletion on the neurogenic output of NSCs. We will also perform stainings of Wnt-responsive neuronal genes, such as NeuroD1 and Prox1, at P90 and P150 in both control and beta-catenin cKO mice.

      3-We are aiming to confirm that the in vitro effects of CHIR99021 on NSCs are mediated by beta-catenin.

      We already provide evidence that stimulation with Wnt3a has the same effect as inhibition of GSK3beta by CHIR99021. To further prove the link of the observed effects to Wnt-beta-catenin signalling, we will repeat some of our key experiments using beta-catenin floxed cells (both induction of neuronal differentiation and re-activation from quiescence) as reviewer 1 suggests.

      Reviewer #1

      Overall, the results are reliable and important for the field. However, several points need to be addressed and clarified to support their conclusion. I am hopeful that the authors find my comments helpful and constructive.

      Many thanks for your insightful comments, we believe they will indeed help us improve our manuscript.

      • Validation of cKO in vivo.

      • Although the authors validated cKO of beta-catenin in vivo using FACS/qPCR at the transcript level, it would be important to check when and to what extent beta-catenin protein is downregulated in qNSCs/active NSCs in vivo. This can be easily assessed by immunohistochemistry. In the same line, although the authors confirmed the reduction of beta-catenin signaling using beta-gal signaling in cKO mice, it would be important to cross-check this by staining for the nuclear localization of beta-catenin. This confirmation would strengthen the authors' statement and make clear that any beta-catenin remaining at the plasma membrane is not compensating for its function.

      • Independent of the confirmation of beta-catenin cKO, it would be important to check whether the downstream targets of Wnt/beta-catenin signaling (e.g., expression of Axin2) were also attenuated. This point should be addressed both in vivo and in vitro.

      We are performing immunohistochemistry and quantification of beta-catenin in control and cKO brain samples, as suggested by the reviewer. Unfortunately, we have not yet found an antibody and labelling protocol that allows us to detect nuclear beta-catenin, even in control samples, so with our current antibody, we won’t be able to show a reduction in nuclear localization of beta-catenin in the cKO samples. We are testing alternative beta-catenin antibodies that could help us overcome this limitation. As the reviewer mentions, we do see a reduction in reporter expression in BATGAL mice upon deletion of beta-catenin. In order to further demonstrate effective Wnt signalling attenuation in our mutant mice we are testing antibodies for Wnt targets such as Axin2, CcnD1 and NeuroD1.

      • Wnt/beta-catenin signals in qNSC and active NSC in vitro.

      The authors indicated that the depletion of beta-catenin had no effect on qNSCs and active NSCs in vitro. However, it is not clear whether Wnt/beta-catenin signaling is activated in their culture conditions. If there is no Wnt signaling input in cultured cells, the depletion of beta-catenin will not have any impact. Therefore, it would be critical to check whether Wnt signaling is activated in control cells in their culture conditions, and whether the downstream targets of Wnt signaling are downregulated in cKO qNSCs/active NSCs.

      We agree that this is an important conceptual point that needs to be clarified. From our data (see Figure S3C), we can see that deletion of beta-catenin in NSCs in vitro blocks their response to Wnt stimulation (with CHIR99021) but does not lower the levels of Axin2. From this, we can deduce that Wnt signalling is indeed not significantly activated in proliferating NSCs in vitro, despite the expression of Wnt ligands by these cells (Figure 3). We will perform further analysis of Wnt target genes in control and cKO NSCs in vitro to confirm this observation. Of note, the lack of Wnt signalling activity in NSCs would further support our claim that Wnt is dispensable for their proliferation and maintenance. We will make this point clearer in the manuscript.

      • ChIR treatment on cKO cells.

      The authors only use WT cells for ChIR treatment. To investigate whether the effect of ChIR comes through the beta-catenin signaling pathway, why don't they use cKO NSCs for ChIR treatment (Fig5-7)?

      This is a great suggestion and we are performing these experiments with control and cKO NSCs.

      Different Wnt signaling levels between in vivo and in vitro.

      The authors indicated that different levels of Wnt signaling could result in different outcomes based on in vitro observation. What are the levels of Wnt signaling in vivo compared to in vitro ChIR treatment? Is activation of Wnt/beta-catenin in vivo much weaker than with in vitro CHIR treatment, so that the contribution of Wnt signaling at endogenous levels is negligible? This may help to explain why Wnt/beta-catenin is dispensable in vivo, at least in the young state. This can be addressed by probing the levels of downstream targets.

      Levels of Wnt signalling are indeed central to our conclusions and we agree that a comparison of Wnt/beta-catenin signalling levels between our in vitro interventions and the in vivo situation would be important. However, we find that directly comparing the levels of downstream Wnt targets between the two systems might prove challenging due to differences in methodology (immunolabeling is not a reliably quantitative method, especially when performed on such different sample types, with different fixation conditions, etc). We will nevertheless attempt such quantifications using immunolabelings for CcnD1, Axin2 and NeuroD1 both in vivo and in vitro. We also want to point out that CHIR is not the only way in which we have stimulated Wnt signalling in NSCs in vitro. In Figure S5, we demonstrate that treatment with Wnt3a can reactivate quiescent neural stem cells in a dose-dependent manner, showing that the effect of Wnt signalling on NSCs can be achieved also with a more physiological intervention.

      Reviewer #2

      A major challenge is to separate cell adhesion functions of beta-catenin from its function in the canonical Wnt/beta-catenin signaling pathway. The authors tested two different conditional bcat alleles (bcatdel ex2-6 ; bcatdel ex3-6) to delete bcat from stem cells. It is a bit unfortunate that the authors chose to test two conditional alleles that would affect cell adhesion and transcriptional activity instead of the Ctnnb1dm allele (Draganova et al. 2015, Stem Cells), which would have been a cleaner way to specifically address the contribution of beta-catenin transcriptional activity in adult hippocampal neural stem cells. Was there a specific reason not to use the Ctnnb1dm conditional mice? Please comment / discuss.

      We agree with the reviewer that the Ctnnb1dm allele would better differentiate between cell adhesion and transcriptional effects of beta-catenin deletion. However, as we see no effect of beta-catenin deletion, we did not find it necessary to further dissect the differential contribution of cell adhesion and the Wnt/beta-catenin pathway in this particular case. We will add a comment on this point to the discussion.

      The authors control for downregulation of beta-catenin signaling activity in the bcatdel ex2-6 through the analysis of the BATGAL reporter. 30 days after recombination, they observe a drop in reporter activity (from 31% to 13%). While this drop shows that at the time of analysis beta-catenin signaling activity was reduced, the lack of complete downregulation of reporter activity raises the issue whether long-term stability of the b-catenin protein may be a confounding factor at this time-point. In particular effects of b-catenin on the DCX population, which to a significant extent is generated several days to weeks before the time-point of analysis, may not be revealed. Data on the time-course of downregulation of the BATGAL reporter could help for the interpretation of the data as would analysis of beta-catenin protein levels in recombined cells. In addition, analysis of bcatdel ex2-6 at a later time-point after recombination, at which beta-catenin signaling activity is further downregulated, would strengthen the surprising finding that loss of beta-catenin signaling activity does not hamper neuronal differentiation in the adult hippocampus.

      We will monitor the disappearance of beta-catenin using immunohistochemistry for beta-catenin and downstream targets of Wnt in control and cKO brains, both at P90 and at a longer time after deletion (P150), as the reviewer suggests. Of note, when we deleted beta-catenin in vitro in NSCs, we could confirm the disappearance of the protein by 48 hours, and therefore beta-catenin stability cannot explain the lack of effect of the deletion (Figure S3B).

      Was quantification performed only in recombined (i.e., reporter positive) cells or in recombined and non-recombined cells? I could not locate that information. Given the evidence for feed-back regulation from intermediate precursor cells / immature neurons to stem cells (e.g. Lavado et al. 2010, Plos Biology), it is important to separately evaluate the development of recombined and non-recombined cells to evaluate the behavior of beta-catenin signaling deficient stem cells.

      The quantifications were always performed in YFP+ recombined cells. The efficiency of recombination was very high (from 83 to 97%), therefore allowing no room for confounding effects of unrecombined cells. We will convey this information in a clearer way in our revised manuscript.

      Reports from (Kuwabara et al. 2009, Nat Neurosci), (Gao et al. 2009, Nat Neurosci) and (Karalay et al. 2011, PNAS) suggest that beta-catenin signaling activity drives dentate granule neuron identity through regulating the expression of Neurod1 and Prox1. Given that in these studies loss of neither Neurod1 nor Prox1 affects neuronal fate commitment, but rather long-term survival, and that the studies by (Gao et al. 2007, J Neurosci) and (Heppt et al. 2020, EMBO J) suggest that loss of beta-catenin affects neuronal survival, it may be interesting to evaluate whether a) dentate granule neuron identity and b) long-term survival of adult-generated neurons are affected. At the minimum these studies should be more extensively discussed.

      As mentioned in our response summary, our main aim is to test the effects of Wnt/beta-catenin signalling on NSCs. Nevertheless, these are excellent suggestions and we are currently performing immunohistochemistry for NeuroD1 and Prox1 to test whether they are downregulated in cKO brain samples. We have also performed a longer deletion of beta-catenin (deletion at P60 and analysis at P150) to test whether neurogenesis is affected in the cKO mice in the longer term.

      It has been suggested that the neural stem cell population in the adult hippocampus may be heterogeneous, with one population being responsible for baseline neurogenesis and being resistant to age-associated depletion and a second population driving high levels of neurogenesis in young adults (see also Urban, Bloomfield and Guillemot 2019, Neuron). The observation that beta-catenin signaling is only active in a small fraction of stem cells and their progeny raises the question whether it fulfills a function only in a specific subpopulation. Such a possibility should at least be discussed.

      This is a very interesting point, which we will include in the discussion of our revised manuscript. We are also performing immunohistochemistry for Id4 together with beta-catenin or downstream targets of Wnt and NSC markers to determine whether the resting population (described in Urban et al. 2016 and Harris et al. 2021), which has low levels of Id4, is more responsive to Wnt than the dormant population.

      The recently published studies by (Rosenbloom et al. 2020, PNAS) and (Heppt et al. 2020, EMBO J) strongly suggest that beta-catenin signaling dynamics are critical for the regulation / modulation of adult hippocampal neurogenesis. The aspect of beta-catenin signaling dynamics should be discussed.

      We will include a discussion of beta-catenin signalling dynamics in the revised version of the manuscript.

      **Significance:**

      Adult neurogenesis is considered an important factor in hippocampal plasticity and its disturbance is thought to contribute to the pathogenesis of several psychiatric and degenerative diseases. Wnt/beta-catenin signaling is considered central to the regulation of adult hippocampal neurogenesis. In this regard, the manuscript describes the potentially very important and surprising finding that deletion of beta-catenin from neural stem cells does not generate major neurogenesis phenotypes. The concern with the present manuscript is that the lack of phenotype requires additional analyses to exclude that phenotypes develop with a delay because of long-term stability of the beta-catenin protein.

      We believe the revisions outlined above will address these concerns.

      The significance of the manuscript and its interest to a wider audience would in addition be greatly enhanced if the authors could provide some mechanistic data that would explain the discrepancies between published functions of Wnt/beta-catenin-signaling dependent regulation of neurogenesis and their own findings. The manuscript would also gain significance if the authors would provide solid data for their interesting hypothesis that beta-catenin-signaling contributes to the regulation of adult hippocampal neurogenesis in response to extrinsic stimuli. In this regard one potential approach would be to analyse whether extrinsic stimuli such as running would be able to stimulate the activation of stem cells.

      Both finding a mechanism to explain the observed discrepancies and demonstrating that Wnt has a role in the response of NSCs to extrinsic stimuli are excellent follow-up suggestions to our work and we thank the reviewer for these recommendations. However, addressing these points would take many months (if not years) and is not necessary to support the current conclusions of our work. We therefore believe they are out of the scope of this current manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Wnt/beta-catenin signaling is considered central to the regulation of adult hippocampal neurogenesis. In this manuscript Austin and colleagues interrogate the function of beta-catenin-dependent signaling using in vivo beta-catenin conditional knockout and gain-of-function approaches combined with in vitro pharmacological and genetic approaches. The authors confirm previous reports of Wnt/beta-catenin signaling in adult hippocampal neurogenesis and report the surprising findings that

      • Deletion of beta-catenin from stem cells does not affect stem cell numbers and their activation / proliferation in vivo and in vitro

      • Deletion of beta-catenin from stem cells does not affect neuronal differentiation in vivo and in vitro

      Moreover, the authors show that expression of a stabilized form of beta-catenin affects stem cell positioning in vivo and that the effects of treatment of cultured hippocampal stem/progenitor cells with a pharmacological stimulator of Wnt/beta-catenin signaling are dose and time-dependent. The authors discuss that their findings suggest that Wnt/beta-catenin signaling is dispensable for neural stem cell homeostasis and that Wnt/beta-catenin signaling may have a function in the response of stem cells to external stimuli.

      Comments:

      A major challenge is to separate cell adhesion functions of beta-catenin from its function in the canonical Wnt/beta-catenin signaling pathway. The authors tested two different conditional bcat alleles (bcatdel ex2-6 ; bcatdel ex3-6) to delete bcat from stem cells. It is a bit unfortunate that the authors chose to test two conditional alleles that would affect cell adhesion and transcriptional activity instead of the Ctnnb1dm allele (Draganova et al. 2015, Stem Cells), which would have been a cleaner way to specifically address the contribution of beta-catenin transcriptional activity in adult hippocampal neural stem cells. Was there a specific reason not to use the Ctnnb1dm conditional mice? Please comment / discuss.

      The authors control for downregulation of beta-catenin signaling activity in the bcatdel ex2-6 through the analysis of the BATGAL reporter. 30 days after recombination, they observe a drop in reporter activity (from 31% to 13%). While this drop shows that at the time of analysis beta-catenin signaling activity was reduced, the lack of complete downregulation of reporter activity raises the issue whether long-term stability of the b-catenin protein may be a confounding factor at this time-point. In particular effects of b-catenin on the DCX population, which to a significant extent is generated several days to weeks before the time-point of analysis, may not be revealed. Data on the time-course of downregulation of the BATGAL reporter could help for the interpretation of the data as would analysis of beta-catenin protein levels in recombined cells. In addition, analysis of bcatdel ex2-6 at a later time-point after recombination, at which beta-catenin signaling activity is further downregulated, would strengthen the surprising finding that loss of beta-catenin signaling activity does not hamper neuronal differentiation in the adult hippocampus.

      Was quantification performed only in recombined (i.e., reporter positive) cells or in recombined and non-recombined cells? I could not locate that information. Given the evidence for feed-back regulation from intermediate precursor cells / immature neurons to stem cells (e.g. Lavado et al. 2010, Plos Biology), it is important to separately evaluate the development of recombined and non-recombined cells to evaluate the behavior of beta-catenin signaling deficient stem cells.

      Reports from (Kuwabara et al. 2009, Nat Neurosci), (Gao et al. 2009, Nat Neurosci) and (Karalay et al. 2011, PNAS) suggest that beta-catenin signaling activity drives dentate granule neuron identity through regulating the expression of Neurod1 and Prox1. Given that in these studies loss of neither Neurod1 nor Prox1 affects neuronal fate commitment, but rather long-term survival, and that the studies by (Gao et al. 2007, J Neurosci) and (Heppt et al. 2020, EMBO J) suggest that loss of beta-catenin affects neuronal survival, it may be interesting to evaluate whether a) dentate granule neuron identity and b) long-term survival of adult-generated neurons are affected. At the minimum these studies should be more extensively discussed.

      It has been suggested that the neural stem cell population in the adult hippocampus may be heterogeneous, with one population being responsible for baseline neurogenesis and being resistant to age-associated depletion and a second population driving high levels of neurogenesis in young adults (see also Urban, Bloomfield and Guillemot 2019, Neuron). The observation that beta-catenin signaling is only active in a small fraction of stem cells and their progeny raises the question of whether it fulfills a function only in a specific subpopulation. Such a possibility should at least be discussed.

      The recently published studies by (Rosenbloom et al. 2020, PNAS) and (Heppt et al. 2020, EMBO J) strongly suggest that beta-catenin signaling dynamics are critical for the regulation / modulation of adult hippocampal neurogenesis. The aspect of beta-catenin signaling dynamics should be discussed.

      Significance

      Significance:

      Adult neurogenesis is considered an important factor in hippocampal plasticity and its disturbance is thought to contribute to the pathogenesis of several psychiatric and degenerative diseases. Wnt/beta-catenin signaling is considered central to the regulation of adult hippocampal neurogenesis. In this regard, the manuscript describes the potentially very important and surprising finding that deletion of beta-catenin from neural stem cells does not generate major neurogenesis phenotypes. The concern with the present manuscript is that the lack of phenotype requires additional analyses to exclude that phenotypes develop with a delay because of long-term stability of the beta-catenin protein.

      The significance of the manuscript and its interest to a wider audience would in addition be greatly enhanced if the authors could provide some mechanistic data that would explain the discrepancies between published functions of Wnt/beta-catenin-signaling dependent regulation of neurogenesis and their own findings. The manuscript would also gain significance if the authors provided solid data for their interesting hypothesis that beta-catenin signaling contributes to the regulation of adult hippocampal neurogenesis in response to extrinsic stimuli. In this regard, one potential approach would be to analyse whether extrinsic stimuli such as running would be able to stimulate the activation of stem cells.

      Expertise:

      Adult neurogenesis, stem cell biology, signaling

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Wnt/beta-catenin signaling has been studied in the context of adult neurogenesis for decades. It has been shown that modulation of Wnt signaling regulates adult neurogenesis, but the consequences were not always consistent. In this study, the authors developed conditional knockout mouse lines to test whether beta-catenin is essential for the regulation of adult neurogenesis.

      First, using published single-cell sequencing data and a reporter TG mouse system, they validated the expression of Wnt-pathway molecules in qNSCs and active NSCs. Then, beta-catenin conditional knockout (cKO) mice were analyzed. The authors did not find any changes in the total number of NSCs, the activation of NSCs, or the number of IPCs and neuroblasts. Subsequently, using an in vitro culture system, the authors addressed whether proliferation and differentiation are affected under in vitro conditions. Neither proliferation nor activation from the quiescent state was affected in cKO NSCs. Finally, they demonstrated that an artificial stimulation of Wnt signaling by CHIR can induce differentiation or proliferation depending on cellular states and doses; thus, NSCs can respond to Wnt signaling. Based on these data, they concluded that beta-catenin is dispensable for the maintenance/activation of NSCs in vivo, although NSCs can respond to Wnt/beta-catenin signaling. Overall, the results are reliable and important for the field. However, several points need to be addressed and clarified to support their conclusion. I am hopeful that the authors find my comments helpful and constructive.

      1. Validation of cKO in vivo. Although the authors validated the cKO of beta-catenin in vivo using FACS/qPCR at the transcript level, it would be important to check when and to what extent beta-catenin protein is downregulated in qNSCs/active NSCs in vivo. This could be easily assessed by immunohistochemistry. Along the same lines, although the authors confirmed the reduction of beta-catenin signaling using the beta-gal reporter in cKO mice, it would be important to cross-check this by staining for the nuclear localization of beta-catenin. This confirmation would strengthen the authors' statement and make clear that residual beta-catenin at the plasma membrane is not compensating for its function. Independent of the confirmation of the beta-catenin cKO, it would be important to check whether the downstream targets of Wnt/beta-catenin signaling (e.g., expression of Axin2) are also attenuated. This point should be addressed both in vivo and in vitro.
      2. Wnt/beta-catenin signals in qNSCs and active NSCs in vitro. The authors indicated that the depletion of beta-catenin had no effect on qNSCs and active NSCs in vitro. However, it is not clear whether Wnt/beta-catenin signaling is activated in their culture conditions. If there is no Wnt signaling input in cultured cells, the depletion of beta-catenin will not have any impact. Therefore, it would be critical to check whether Wnt signaling is activated in control cells in their culture conditions, and whether the downstream targets of Wnt signaling are downregulated in cKO qNSCs/active NSCs.
      3. CHIR treatment of cKO cells. The authors only use WT cells for CHIR treatment. To investigate whether the effect of CHIR comes through the beta-catenin signaling pathway, why don't they use cKO NSCs for CHIR treatment (Fig. 5-7)?
      4. Different Wnt signaling levels between in vivo and in vitro. The authors indicated, based on in vitro observations, that different levels of Wnt signaling could result in different outcomes. What are the levels of Wnt signaling in vivo compared to in vitro CHIR treatment? Is activation of Wnt/beta-catenin in vivo much weaker than with in vitro CHIR treatment, such that the contribution of Wnt signaling at endogenous levels is negligible? This may help to explain why Wnt/beta-catenin is dispensable in vivo, at least in the young state. This can be addressed by probing the levels of downstream targets.

      Significance

      Significant.

      A genetic approach to address the role of Wnt/beta-catenin signaling is critical for the field. The audience would be interested if this study clarifies the previously reported discrepancies.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Response to reviewers

      We first thank Review Commons for recruiting such knowledgeable reviewers to comment on our manuscript. We appreciate their diverse set of useful and constructive comments, which should help us improve the manuscript substantially. Please see our response to each reviewer’s comments below.

      Reviewer #1:

      **Summary:** The authors describe a useful modified fluctuation assay that couples conventional mutation rate analysis with mutational spectrum characterization of forward mutations at the S. cerevisiae CAN1 locus. They nicely showed that wild yeast isolates display a wide range of mutation rates with strains AAR and AEQ displaying rates ~10-fold higher than the control lab strain. These two strains also showed a bias for C>A mutations, and were the only strains analyzed that had a mutation spectrum statistically different from the lab control. Together, these data provide a compelling proof-of-principle of the applicability of the modified fluctuation analysis approach described in this manuscript. Overall, the manuscript is very well written, and the work reported in it does represent a valuable contribution to the field. However, two primary shortcomings were identified that can be addressed to strengthen the conclusions prior to publication. Both points described below pertain to the analysis of the possible C>A specific mutator phenotype in strains AAR and AEQ.

      Response:

      We thank the reviewer for this positive response. We have made a plan, detailed below, to address the shortcomings the reviewer has highlighted.

      **Major comments:**

      1. The work presented in the manuscript does suggest that these two haploids are likely to display the C>A mutator phenotype. Yet, the authors fell short of providing a full and unambiguous demonstration that would elevate the significance of their discovery. They could have directly tested the predicted C>A specific mutator phenotype by conducting additional experiments, one of which is relatively simple. Specifically, they could have performed a simple reversion-based mutation assay to validate the reported C>A mutator phenotype displayed by AAR and AEQ. For example, into AAR, AEQ, and a wild type control, the authors could introduce an engineered auxotrophic marker allele (e.g., ura3 mutation) caused by an A to C substitution, which upon mutation back to A restores prototrophic growth in minimal media (i.e., reversion from ura3-C to URA3-A). Such a specific reversible allele should be relatively easy to integrate into the AAR and AEQ genomes, as well as in the control strain. Based on the authors' prediction, AAR and AEQ should display a very large increase (far higher than 10 fold) in the reversion rate when compared to a control haploid. To demonstrate the specificity of the mutation spectrum, the authors could test the reversion rates of a different engineered allele requiring a reversion mutation in the opposite direction (i.e., reversion from ura3-A to URA3-C). If the AAR and AEQ mutator is specific to C>A, one would predict that all three strains should have similar mutation rates for a reversion in the A>C direction. This additional genetic work would thoroughly validate the central discovery and would reinforce the usefulness of the method described in the manuscript.

      Alternatively, a conventional mutation accumulation and whole genome re-sequencing experiment with parallel lines of AAR, AEQ and a control strain would also very effectively validate the C>A mutator prediction, and it would also answer the authors' discussion point about specificity to the CAN1 locus. However, it would be more costly and much more time consuming.

      Response:

      We thank the reviewer for these detailed, clear suggestions regarding additional methodology for further validating our results. We appreciate that parallel independent validations always add credibility to unexpected results like the ones presented in our manuscript. We’ve been considering these suggestions seriously, but our concern is that it is much less straightforward to engineer the genomes of these wild yeast than one might expect based on experiments with standard laboratory strains. Unforeseen roadblocks related to the biology of AAR and AEQ could end up making the URA3 reversion assay take even longer than an MA study. As we understand it, the two main concerns that might necessitate this additional undertaking are that either our novel assay for ascertaining mutations in CAN1 doesn’t work properly, or that the mosaic beer strains mutate significantly differently outside CAN1. Below we describe revisions to the text that we think will clearly represent these caveats and the relatively modest uncertainty associated with them.

      To further justify the soundness of our claim that AAR and AEQ have distinctive mutation rates and spectra, we plan to add additional discussion of the validation approaches that are presented in the manuscript to verify the accuracy of our pipeline. Although the ability of fluctuation assays to estimate mutation rates is well established, the identification of the spectra using our next-generation-sequencing-based pipeline is novel, so we used Sanger sequencing to validate the exact de novo mutations it ascertained in a select control strain. Our Sanger sequencing test found our assay to have an undetectably low false positive rate and a false negative rate that was much too low to account for the differences we measured between AAR, AEQ, and the standard lab strains. The fact that we also observed similar mutation spectra from control lab strains used in previous CAN1-based studies further demonstrates the reliability of our method, and it is notable that most natural isolates were measured to have very similar mutation spectra to lab strains (Figure 4 and Supplementary Figure S8-S9). We agree that further validation would be needed to read much into the more subtle differences in mutation rates and spectra that we saw hints of between other strains, and for that reason, we focused this paper on the differences that well exceed what we measured to be our measurement pipeline’s margin of error.

      It is true that the genome-wide mutation rate might differ somewhat from the mutation rate at the CAN1 locus, but the mutation spectrum at the CAN1 locus measured in a previous study (Lang and Murray, 2008) was very similar to the genome-wide mutation spectra obtained from MA studies (Sharp et al., 2018), with just a small overall increase of mutations with C/G nucleotides (the second-to-last paragraph on page 17 and Supplementary Figure S13). Moreover, we have avoided making any claims of seeing distinct mutation rates or spectra based on “apples-to-oranges” comparisons between mutation spectra measured at CAN1 and spectra measured across the whole genome.

      We also note that the enrichment of C>A mutations in AEQ and AAR is not only observed from our de novo mutation data in CAN1, but also seen in rare natural polymorphisms genome-wide (Figure 1B, 5A,B). Rare natural polymorphisms are recent mutations that occurred during the history of the strain, and the fact that they are disproportionately enriched for C>A mutations in these strains indirectly shows that the C>A enrichment occurs not only at CAN1, as measured in our experiment, but has also been occurring during natural mutation accumulation genome-wide.

      The second concern is in regard to the relatively extensive conclusions drawn about the possible evolutionary significance of the possible C>A mutator in AAR and AEQ. The authors should be more cautious and conservative in the proposed interpretation. As the authors note:

      'Three of the four C>A-enriched mosaic beer strains, AAR, AEQ, and SACE_YAG, are all haploid derivatives of the [highly heterozygous] diploid Saccharomyces cerevisiae var diastaticus strain CBS1782, which was isolated in 1952 from super-attenuated beer.'

      From this statement, and because the paper cited provided few details on the isolation of CBS1782, it is presumed that these haploid derivatives were most likely isolated as recombinant spores. Furthermore, it is unclear when this isolation occurred, and for how many generations strains AAR and AEQ have been propagated in a haploid state.

      Herein lies a critical point: AAR and AEQ were recently derived from a diploid background with a "high level of heterozygosity". In a heterozygous diploid context, deleterious point mutations (and any resulting mutator phenotypes) would likely be masked by the presence of wild-type alleles. Now, as haploids, they express a novel genotype (i.e., combination of defective or incompatible parental alleles), which manifests as a mutator phenotype. In this respect, AAR and AEQ appear analogous to the spore derivatives of the incompatible cMLH1-kPMS1 isolate referred to in the manuscript as a notable exception. The analysis of strains harboring incompatible MLH1-PMS1 mutations by Raghavan et al. demonstrated that the heterozygous diploid parents were not themselves mutators, but that haploid spores which had inherited the pair of incompatible alleles displayed a mutator phenotype. Collectively, while it can certainly be argued that the strains AAR and AEQ (like the MLH1/PMS1 incompatible strains) are mutators now, this fact alone does not support the conclusion that they have adapted to survive the expression of an extant mutator phenotype. This premise could be tested by analyzing the mutation rates/spectra of four new spores derived from a single tetrad of CBS 1782. Do the four sibling spores display similar or different mutation rates and spectra? If all four spores from a single tetrad exhibit the 10-fold increase in CAN1 mutation rate and the C>A transversion bias, then it can be inferred that the diploid parent is also a mutator in the same manner. Further direct analysis of mutation rates and spectrum in the parent diploid CBS 1782 would complete the work. This finding would be quite significant, and would provide strong evidence that wild strains can in fact tolerate the expression of a chronic mutator allele.

      Response:

      We thank the reviewer for suggesting additional study of the ancestral diploid strain CBS 1782, and we agree this could add a lot to the manuscript, especially given the high level of heterozygosity in the diploid and the link to the previous MLH1-PMS1 incompatibility story. We have obtained a sample of CBS 1782 and plan to knock out its HO locus using CRISPR, perform tetrad dissection of spores freshly derived from the diploid, and then measure mutation rates and spectra in all four segregants derived from a single tetrad (provided that all four spores end up growing). We plan to collect and sequence about 50 mutations to get qualitative results on the mutation rates and spectra of these segregants. We also plan to sequence the whole genome of the strain CBS 1782 and examine polymorphisms together with the 1011 strains to check for any signal of C>A enrichment. We recognize that our pipeline as currently implemented will not let us directly measure the mutation spectrum of the diploid, which is inaccessible to our pipeline given its two functional copies of CAN1 and the recessive nature of canavanine resistance. That being said, the elevation of the C>A fraction in natural polymorphisms found in AAR and AEQ provides evidence for prolonged activity of the mutator phenotype in the wild and/or in the domesticated environment from which CBS 1782 was derived. However, we acknowledge we have limited information about how these haploids were propagated before they were banked.

      **Minor comments:** A final, relatively minor point. That the new haploids AAR and AEQ show distinct mutation rates and spectra opens the door to an interesting line of inquiry, which may help to identify the causative mutator allele in a manner more efficient than searching for missense mutations. It is stated, and it is understandable, that the identification of the possible causal mutations is beyond the scope of the present manuscript. In this spirit, it would be much more appropriate to restrict such considerations to the Discussion section. Specifically, while the authors make a plausible case for OGG1 being a candidate gene responsible for the C>A mutator phenotype, no experimental demonstration was attempted. As such, that text segment should be moved from the Results to the Discussion section.

      Response:

      We agree with the reviewer that the current manuscript lacks genetic evidence implicating OGG1, and we will move that section from the Results to the Discussion. Future work is underway to test and identify the causal loci for the mutator phenotype.

      Reviewer #1 (Significance (Required)): As stated in the summary section above, the manuscript by Jiang et al represents a substantial contribution to the fields of genome stability and genome evolution. The method described is likely to be useful beyond budding yeast. The work will be appreciated by a broad audience of geneticists. The additional work and text modifications proposed above would likely further elevate the impact of this work.

      Response:

      We are very grateful for this generous assessment and we likewise hope our planned revisions will further elevate the paper’s potential impact.

      Reviewer #2:

      Mutation is a fundamental force in organismal evolution, and therefore understanding the evolution of mutational mechanisms is important in evolutionary studies. In this manuscript, the authors used strains of S. cerevisiae as a model system to study the variations of rates and spectra in mutations with bioinformatic and experimental approaches. First, the authors analyzed the polymorphism data from 1011 strains by PCA analysis and show the variations in spectra. Second, the authors used fluctuation tests combined with deep sequencing of the resistance gene to identify mutation rates and spectra in 18 strains, which show ~10-fold mutation rate variations and increased C-to-A mutations in two strains.

      For the second part, the experimental procedures and statistical analysis are mostly solid. For the first part, as the authors said in the introduction, polymorphism is not equal to the mutation spectra. I think the authors did a good job by being cautious in the wording and having no over-inference after the analysis. It is thus inevitable that the conclusion of this part sounds mostly descriptive. The overall writing is very clear. I will recommend the publication in field-specific journals.

      Response:

      We thank the reviewer for these positive comments. We will address each minor point below.

      **Minor comments:** P9 - It is very hard to not wonder how the 16 strains were picked in the fluctuation tests. Some comments on that will be appreciated. E.g., was that informed by the results of Fig 1?

      Response:

      We actually did not pick strains based on the results of Figure 1, one reason being that the CAN1 reporter method only works on haploid strains with a canavanine sensitivity phenotype. We also restricted our analysis to strains without known aneuploidies to maximize our ability to accurately measure the spectra of the strains’ polymorphisms. Given these constraints, we included at least two randomly selected strains from each clade of the 1011 collection whenever possible. These constraints are currently explained in the second-to-last paragraph on page 9, and will be explained in more detail in the revision.

      P17- In the paragraph "natural selection might contribute ..." , is there any example of "certain mutation types are more often beneficial than others"?

      Response:

      One example of this is that transitions are more often synonymous than transversions are (Freeland and Hurst, 1998), and mutations that create or destroy CpG sites are more likely to alter gene regulation than other mutation types are (in species other than yeast where CpGs are methylated). We recognize that these effects are likely not large, which is one reason we don’t think natural selection is a great explanation for mutation spectrum differences among groups. We will mention these examples explicitly in the revised text.

      P20 - Extra ')' in the sentence "Adjacent indels were merged if their frequencies differed by less than 10%)."

      Response:

      We will fix this in revision.

      In the discussion, it might be good to add a paragraph to compare the rate and spectra reported here with the ones found by MA followed by NGS (e.g., Zhu et al. 2014).

      Response:

      We’ll be sure to add a reference to the Zhu et al. (2014) spectrum in the discussion, extending our existing comparison of mutation spectra previously reported using CAN1 (Lang and Murray, 2008) and the MA approach (Sharp et al., 2018) (currently discussed in the second-to-last paragraph on page 17, Supplementary Figure S13). Our CAN1 method also obtains results that are consistent with the Lang and Murray (2008) study on the same control strain (the last paragraph on page 11).

      Reviewer #2 (Significance (Required)): The significance of this manuscript will be relatively specific to evolutionary biologists and geneticists, especially those who use yeasts as a model system. For example, I expect the variation of mutation rates and spectra found in this manuscript will impact the following population-genetic analysis in this collection of 1011 strains and motivate more studies on the molecular machineries which affect mutation rates and spectra.

      In addition, in terms of methodological novelty, adding a novel step of reporter-gene sequencing is a reasonable way to get some information on mutation spectra as it is less labor-intensive than NGS of MAs. Other statistical or experimental procedures in this manuscript mostly follow the approaches which have been developed in previous literature and thus show not much novelty.

      Response:

      We thank the reviewer for this positive assessment. Since evolutionary biology, population genetics, and model organism genetics are three of eLife’s major focus areas, we are hoping to communicate our results to this journal’s broad audience rather than restrict ourselves to a journal focusing too narrowly on just one of these focus areas.

      Reviewer #3:

      **Summary** The authors show that certain yeast strains have altered mutation rates/bias. The study is well motivated, as genetic variation in mutation rates is not easily uncovered, and it capitalizes on yeast and a high-throughput mutation rate/bias method that validates findings of C>A bias from yeast polymorphism data. The results are solid and clearly presented and I have no major concerns.

      Response:

      We are very grateful for this positive response. Please find our response to each minor comment below.

      **Major comments** None

      **Minor comments** Should have comma: "In addition, environmental ..."

      Response:

      We will fix this in revision.

      Using S. paradoxus to classify derived vs ancestral alleles may not work as well as allele frequency. A 1/100 rare variant is 100x more likely derived than a common variant. But with S. paradoxus divergence of say 5%, 5% of polymorphic sites are misclassified or NA. Of course, since you used both, this is not a concern. But the number of variants included/excluded in each analysis should be reported. Also, I was a bit surprised that the rare variants are more noisy since most variants are rare.

      Response:

      We agree that the heuristic of classifying rare alleles as derived will do the right thing the majority of the time, but this could potentially create artifactual differences between the mutation spectra of different populations because the exact ratio of rare derived alleles to common derived alleles depends on the population’s demographic history and true site frequency spectrum. If two populations had the same mutation spectrum but very different proportions of variants that are polarized incorrectly, this could create the appearance of a mutation spectrum difference where none exists. In the revision, we will be sure to report the total number of variants filtered because of the variation present in S. paradoxus.
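
      The concern about polarization can be made concrete with a minimal sketch, which is not the pipeline used in the manuscript. It assumes biallelic SNPs, an outgroup allele (for example from S. paradoxus) taken as the ancestral state, and the common convention of collapsing strand-complementary substitutions into six classes. The function names `polarize` and `mutation_type` are hypothetical.

```python
# Illustrative only: hypothetical helper names, not the manuscript's pipeline.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mutation_type(ancestral: str, derived: str) -> str:
    """Collapse a substitution onto six strand-symmetric classes (G>T is reported as C>A)."""
    if ancestral == derived:
        raise ValueError("ancestral and derived alleles must differ")
    if ancestral in ("G", "T"):
        # Report on the strand where the ancestral base is C or A.
        ancestral, derived = COMPLEMENT[ancestral], COMPLEMENT[derived]
    return f"{ancestral}>{derived}"


def polarize(allele_1: str, allele_2: str, outgroup: str):
    """Return (ancestral, derived) using the outgroup allele, or None if it matches neither."""
    if outgroup == allele_1:
        return allele_1, allele_2
    if outgroup == allele_2:
        return allele_2, allele_1
    return None  # site cannot be polarized and would be filtered out


# A true C>A mutation, polarized correctly and then with the two alleles swapped.
correct = mutation_type(*polarize("C", "A", outgroup="C"))
swapped = mutation_type(*polarize("C", "A", outgroup="A"))
print(correct, swapped)
```

      Under these assumptions, a C>A mutation whose ancestral state is misassigned is tallied as A>C, so unequal mispolarization rates between populations could shift two mutation classes at once.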

      The reviewer is right to point out that rare variants are generally more abundant than common variants, but this pattern is much more pronounced in a species like humans that has undergone recent population expansion than it appears to be in S. cerevisiae, which appears to have a higher proportion of older, shared variation. We hope this clarifies why the rare variant mutation spectrum PCA appears noisier than the plot made from variation across more frequency categories.

      Regarding variation in mutation rate based on canavanine resistance: there is a caveat that some strains may be more canavanine resistant, due to differences in transporter abundance or some other aspect of metabolism. Thus, the same mutation would survive and grow (barely) in one strain background, but not another. This caveat is very unlikely to have much of an impact but it would be worth discussing.

      Response:

      Thanks for pointing this out. We also considered the possibility that our mutation rate estimates could be confounded by slight differences in canavanine resistance between strains, and will address this point in the discussion.

      The explanation for synonymous mutations is hitchhikers or errors. However, they could also disrupt translation; here is one possibility: PMC4552401.

      Response:

      Thanks for pointing this out. We will expand our statement on the possible significance of synonymous mutations to include modification of transcription and translation efficiency.

      Are there CAN allele differences between strains? If there are some, it might be worth mentioning why you do/don't think this influences the mutation rate. E.g. CGG is one step from stop but CGT is not.

      Response:

      The reviewer makes a good point that there are segregating differences among these strains in the sequence of CAN1. We plan to add an analysis where we calculate the number of opportunities for missense and nonsense mutations in each strain, as a function of its CAN1 sequence, to put a bound on the amount that these differences could affect our estimates of mutation rates in each strain.
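
      As a rough illustration of what such a count could look like (a minimal sketch under stated assumptions, not the analysis planned for the revision), the snippet below enumerates every single-nucleotide substitution in an in-frame coding sequence and classifies it under the standard genetic code. The function name `count_opportunities` is hypothetical, and all substitutions are weighted equally, which ignores strain-specific mutation spectra.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_opportunities(cds: str) -> dict:
    """Count single-base substitutions in an in-frame CDS by their coding effect."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    cds = cds.upper()
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[codon]
        if ref_aa == "*":
            continue  # skip the terminal stop codon itself
        for pos, base in product(range(3), BASES):
            if base == codon[pos]:
                continue
            alt_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if alt_aa == "*":
                counts["nonsense"] += 1
            elif alt_aa == ref_aa:
                counts["synonymous"] += 1
            else:
                counts["missense"] += 1
    return counts


# Two arginine codons differ in how many substitutions create a stop codon.
print(count_opportunities("CGA"))  # CGA -> TGA is a nonsense substitution
print(count_opportunities("CGT"))  # no single substitution in CGT yields a stop
```

      Running the same count on each strain's CAN1 allele would give per-strain totals that can be compared directly; weighting each substitution by its mutation type would be a natural refinement.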

      For the allele counts in Figure 5B. 2 indicates a variant is present in one strain so there are only 9 mutations present in AAR and not found in ANY other strain or just not found in the four listed? Likewise AAR has 36 for count 4, meaning that there are 36 variants present in AAR and one other strain, where other strains are just the 4 shown in the table, or other strains being any of the 1011?

      Response:

      The allele count in Figure 5B represents the number of times the derived allele is present in the whole population. In this case, the whole population refers to the 1011 strains minus 336 strains that are so closely related to other strains in the panel that they are effectively duplicates. An allele of count 2 might be homozygous in AAR and absent from all other strains, or present as one heterozygous copy in AAR as well as one heterozygous copy in another strain. We will explain this more clearly in the revised manuscript.
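
      A toy example may make the counting convention easier to follow. It is a sketch only: it assumes each strain's genotype is coded as 0, 1, or 2 copies of the derived allele, and the strain labels other than AAR, as well as the function name `derived_allele_count`, are hypothetical.

```python
def derived_allele_count(genotypes: dict) -> int:
    """Sum derived-allele copies across strains (each genotype is 0, 1, or 2 copies)."""
    if any(copies not in (0, 1, 2) for copies in genotypes.values()):
        raise ValueError("each genotype must be 0, 1, or 2 derived copies")
    return sum(genotypes.values())


# Two different configurations that both yield an allele count of 2.
homozygous_in_one = {"AAR": 2, "strain_X": 0, "strain_Y": 0}
heterozygous_in_two = {"AAR": 1, "strain_X": 1, "strain_Y": 0}

print(derived_allele_count(homozygous_in_one))
print(derived_allele_count(heterozygous_in_two))
```

      Both configurations fall in the same allele-count bin, which is why a count of 2 does not by itself indicate whether a variant is private to a single strain.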

      "To our knowledge, this is one of the first" This is an odd way to put it and could be rephrased. As it stands you are either the first and not knowledgeable or knowledgeable and not the first.

      Response:

      Thanks. We will revise this to state that to our knowledge, we are the first to report such a discovery.

      "humans, great apes, .." Could you put the citations in the discussion too. I was a little surprised there was no mention of C>A bias as it relates to studies in bacteria and cancer, where there has been a lot of work on mutational spectra. A comment on this literature or whether the C>A biases are not found elsewhere would be nice.

      Response:

      We will add citations and discussion of bacteria and cancer in the revised manuscript. The reviewer is right to point out that C>A mutations do come up in cancer signatures, for example in familial adenomatous polyposis disorders where excision repair of 8-oxoguanine is compromised.

      Reviewer #3 (Significance (Required)):

      I am an evolutionary geneticist with expertise in genomics and bioinformatics. In addition to reviewing papers I also regularly handle papers as an editor. The manuscript provides rare insight into population variation in mutation rates. While differences in mutational biases are well known between species and in some cases within a species, we typically don't know what causes these biases. Environmental factors are often thought to be involved; this work clearly shows that genetic factors (mutator strains) exist and impact polymorphism in yeast. The manuscript does a nice job in the introduction of explaining the background on mutation rate research and the motivation for the work. It also clearly explains the advantage of an experimental high-throughput mutation rate/spectra approach. Thus, I believe this new angle on a long-standing problem will be of interest to the community of evolutionary geneticists outside of yeast researchers.

      Response:

      We appreciate this very generous assessment, thank you!

      References

      Freeland, S. J. and Hurst, L. D. (1998) ‘The genetic code is one in a million’, Journal of molecular evolution, 47(3), pp. 238–248.

      Lang, G. I. and Murray, A. W. (2008) ‘Estimating the Per-Base-Pair Mutation Rate in the Yeast Saccharomyces cerevisiae’, Genetics, 178(1), pp. 67–82.

      Sharp, N. P. et al. (2018) ‘The genome-wide rate and spectrum of spontaneous mutations differ between haploid and diploid yeast’, Proceedings of the National Academy of Sciences of the United States of America, 115(22), pp. E5046–E5055.

      Zhu, Y. O. et al. (2014) ‘Precise estimates of mutation rate and spectrum in yeast’, Proceedings of the National Academy of Sciences of the United States of America, 111(22), pp. E2310–E2318.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      The authors show that certain yeast strains have altered mutation rates/bias. The study is well motivated, since genetic variation in mutation rates is not easily uncovered, and it capitalizes on yeast and a high-throughput mutation rate/bias method that validates findings of C>A bias from yeast polymorphism data. The results are solid and clearly presented and I have no major concerns.

      Major comments

      None

      Minor comments

      Should have comma: "In addition, environmental ..."

      Using S. paradoxus to classify derived vs ancestral alleles may not work as well as allele frequency. A 1/100 rare variant is 100x more likely to be derived than a common variant. But with S. paradoxus divergence of, say, 5%, 5% of polymorphic sites are misclassified or NA. Of course, since you used both, this is not a concern. But the number of variants included/excluded in each analysis should be reported. Also, I was a bit surprised that the rare variants are noisier, since most variants are rare.
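      The outgroup-based classification discussed in this comment can be sketched as below. This is a minimal illustration, assuming biallelic sites and a single outgroup base per site; the function name `polarize` is hypothetical and does not describe the authors' pipeline.

```python
def polarize(allele_a, allele_b, outgroup_allele):
    """Return the inferred derived allele at a biallelic site, or None.

    The allele shared with the outgroup is treated as ancestral, so the
    other allele is called derived. If the outgroup base matches neither
    allele (or is missing), the site cannot be polarized and None is
    returned (the "NA" case in the comment above).
    """
    if outgroup_allele == allele_a:
        return allele_b
    if outgroup_allele == allele_b:
        return allele_a
    return None


print(polarize("C", "A", "C"))   # outgroup carries C, so A is called derived
print(polarize("C", "A", "G"))   # outgroup matches neither allele: unresolved
```

      A substitution on the outgroup lineage at a polymorphic site either leaves the site unresolved or, if the outgroup happens to match the truly derived allele, flips the call, which is the misclassification the comment refers to.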

      Regarding variation in mutation rate based on canavanine resistance, there is a caveat that some strains may be more canavanine resistant, due to differences in transporter abundance or some other aspect of metabolism. Thus, the same mutation would survive and grow (barely) in one strain background, but not another. This caveat is very unlikely to have much of an impact but it would be worth discussing.

      The explanation for synonymous mutations is hitchhikers or errors. However, they could also disrupt translation; here's one possibility: PMC4552401.

      Are there CAN allele differences between strains? If there are some, it might be worth mentioning why you do/don't think this influences the mutation rate. E.g. CGG is one step from stop but CGT is not.

      For the allele counts in Figure 5B: does a count of 2 indicate that a variant is present in one strain, so that there are only 9 mutations present in AAR and not found in ANY other strain, or just not found in the four listed? Likewise, AAR has 36 for count 4; does this mean that there are 36 variants present in AAR and one other strain, where the other strains are just the 4 shown in the table, or any of the 1011?

      "To our knowledge, this is one of the first" This is an odd way to put it and could be rephrased. As it stands, you are either the first and not knowledgeable or knowledgeable and not the first.

      "humans, great apes, .." Could you put the citations in the discussion too. I was a little surprised there was no mention of C>A bias as it relates to studies in bacteria and cancer, where there has been a lot of work on mutational spectra. A comment on this literature or whether the C>A biases are not found elsewhere would be nice.

      Significance

      I am an evolutionary geneticist with expertise in genomics and bioinformatics. In addition to reviewing papers, I also regularly handle papers as an editor. The manuscript provides rare insight into population variation in mutation rates. While differences in mutational biases are well known between species and in some cases within a species, we typically don't know what causes these biases. Environmental factors are often thought to be involved; this work clearly shows that genetic factors (mutator strains) exist and impact polymorphism in yeast. The manuscript does a nice job in the introduction of explaining the background on mutation rate research and the motivation for the work. It also clearly explains the advantage of an experimental high-throughput mutation rate/spectra approach. Thus, I believe this new angle on a long-standing problem will be of interest to the community of evolutionary geneticists outside of yeast researchers.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Mutation is a fundamental force in organismal evolution, and therefore understanding the evolution of mutational mechanisms is important in evolutionary studies. In this manuscript, the authors used strains of S. cerevisiae as a model system to study the variation in mutation rates and spectra with bioinformatic and experimental approaches. First, the authors analyzed the polymorphism data from 1011 strains by PCA and showed the variation in spectra. Second, the authors used fluctuation tests combined with deep sequencing of the resistance gene to identify mutation rates and spectra in 18 strains, which showed ~10-fold mutation rate variation and increased C-to-A mutations in two strains.

      For the second part, the experimental procedures and statistical analysis are mostly solid. For the first part, as the authors said in the introduction, polymorphism is not equal to the mutation spectrum. I think the authors did a good job of being cautious in the wording and avoiding over-inference after the analysis. It is thus inevitable that the conclusion of this part sounds mostly descriptive. The overall writing is very clear. I would recommend publication in a field-specific journal.
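      For readers unfamiliar with fluctuation tests, the basic logic can be illustrated with the classical p0 estimator of Luria and Delbrück. This is a minimal sketch under stated assumptions: the p0 method is swapped in here for simplicity and is not necessarily the estimator used in the manuscript, and the function name and input values are hypothetical.

```python
import math


def p0_mutation_rate(cultures_without_mutants, total_cultures, cells_per_culture):
    """Estimate the per-cell, per-division mutation rate with the p0 method.

    The expected number of mutation events per culture is m = -ln(p0),
    where p0 is the fraction of parallel cultures with no resistant
    mutants; dividing m by the final cell number gives the rate.
    """
    if not 0 < cultures_without_mutants < total_cultures:
        raise ValueError("p0 must be strictly between 0 and 1")
    p0 = cultures_without_mutants / total_cultures
    m = -math.log(p0)
    return m / cells_per_culture


# Hypothetical example: 50 of 100 cultures have no mutants, 1e6 cells each.
rate = p0_mutation_rate(50, 100, 1e6)
print(f"{rate:.3e}")
```

      The p0 method only works when some, but not all, cultures lack mutants, which is why the function rejects the boundary cases.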

      Minor comments:

      P9 - It is very hard not to wonder how the 16 strains were picked for the fluctuation tests. Some comments on that would be appreciated. E.g., was that informed by the results of Fig 1?

      P17- In the paragraph "natural selection might contribute ..." , is there any example of "certain mutation types are more often beneficial than others"?

      P20 - Extra ')' in the sentence "Adjacent indels were merged if their frequencies differed by less than 10%)." In the discussion, it might be good to add a paragraph comparing the rates and spectra reported here with those found by the MA followed by NGS approach (e.g., Zhu et al. 2014).

      Significance

      The significance of this manuscript will be relatively specific to evolutionary biologists and geneticists, especially those who use yeasts as a model system. For example, I expect the variation in mutation rates and spectra found in this manuscript will affect subsequent population-genetic analyses of this collection of 1011 strains and motivate more studies on the molecular machinery that affects mutation rates and spectra.

      In addition, in terms of methodological novelty, adding a novel step of reporter-gene sequencing is a reasonable way to get some information on mutation spectra, as it is less labor-intensive than NGS of MA lines. Other statistical or experimental procedures in this manuscript mostly follow approaches developed in the previous literature and thus show little novelty.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors describe a useful modified fluctuation assay that couples conventional mutation rate analysis with mutational spectrum characterization of forward mutations at the S. cerevisiae CAN1 locus. They nicely showed that wild yeast isolates display a wide range of mutation rates with strains AAR and AEQ displaying rates ~10-fold higher than the control lab strain. These two strains also showed a bias for C>A mutations, and were the only strains analyzed that had a mutation spectrum statistically different from the lab control. Together, these data provide a compelling proof-of-principle of the applicability of the modified fluctuation analysis approach described in this manuscript. Overall, the manuscript is very well written, and the work reported in it does represent a valuable contribution to the field. However, two primary shortcomings were identified that can be addressed to strengthen the conclusions prior to publication. Both points described below pertain to the analysis of the possible C>A specific mutator phenotype in strains AAR and AEQ.

      Major comments:

      1. The work presented in the manuscript does suggest that these two haploids are likely to display the C>A mutator phenotype. Yet, the authors fell short of providing a full and unambiguous demonstration that would elevate the significance of their discovery. They could have directly tested the predicted C>A specific mutator phenotype by conducting additional experiments, one of which is relatively simple. Specifically, they could have performed a reversion-based mutation assay to validate the reported C>A mutator phenotype displayed by AAR and AEQ. For example, into AAR, AEQ, and a wild-type control, the authors could introduce an engineered auxotrophic marker allele (e.g., a ura3 mutation) caused by an A to C substitution, which upon mutation back to A restores prototrophic growth in minimal media (i.e., reversion from ura3-C to URA3-A). Such a specific revertible allele should be relatively easy to integrate into the AAR and AEQ genomes, as well as in the control strain. Based on the authors' prediction, AAR and AEQ should display a very large increase (far higher than 10-fold) in the reversion rate when compared to a control haploid. To demonstrate the specificity of the mutation spectrum, the authors could test the reversion rates of a different engineered allele requiring a reversion mutation in the opposite direction (i.e., reversion from ura3-A to URA3-C). If the AAR and AEQ mutator is specific to C>A, one would predict that all three strains should have similar mutation rates for a reversion in the A>C direction. This additional genetic work would thoroughly validate the central discovery and would reinforce the usefulness of the method described in the manuscript.

      Alternatively, a conventional mutation accumulation and whole genome re-sequencing experiment with parallel lines of AAR, AEQ and a control strain would also very effectively validate the C>A mutator prediction, and it would also answer the authors' discussion point about specificity to the CAN1 locus. However, it would be more costly and much more time consuming.

      2. The second concern is in regard to the relatively extensive conclusions drawn about the possible evolutionary significance of the possible C>A mutator in AAR and AEQ. The authors should be more cautious and conservative in the proposed interpretation. As the authors note:

      'Three of the four C>A-enriched mosaic beer strains, AAR, AEQ, and SACE_YAG, are all haploid derivatives of the [highly heterozygous] diploid Saccharomyces cerevisiae var diastaticus strain CBS1782, which was isolated in 1952 from super-attenuated beer.'

      From this statement, and because the paper cited provided few details on the isolation of CBS1782, it is presumed that these haploid derivatives were most likely isolated as recombinant spores. Furthermore, it is unclear when this isolation occurred, and for how many generations strains AAR and AEQ have been propagated in a haploid state.

      Herein lies a critical point: AAR and AEQ were recently derived from a diploid background with a "high level of heterozygosity". In a heterozygous diploid context, deleterious point mutations (and any resulting mutator phenotypes) would likely be masked by the presence of wild-type alleles. Now, as haploids, they express a novel genotype (i.e., combination of defective or incompatible parental alleles), which manifests as a mutator phenotype. In this respect, AAR and AEQ appear analogous to the spore derivatives of the incompatible cMLH1-kPMS1 isolate referred to in the manuscript as a notable exception. The analysis of strains harboring incompatible MLH1-PMS1 mutations by Raghavan et al. demonstrated that the heterozygous diploid parents were not themselves mutators, but that haploid spores which had inherited the pair of incompatible alleles displayed a mutator phenotype. Collectively, while it can certainly be argued that the strains AAR and AEQ (like the MLH1/PMS1 incompatible strains) are mutators now, this fact alone does not support the conclusion that they have adapted to survive the expression of an extant mutator phenotype. This premise could be tested by analyzing the mutation rates/spectra of four new spores derived from a single tetrad of CBS 1782. Do the four sibling spores display similar or different mutation rates and spectra? If all four spores from a single tetrad exhibit the 10-fold increase in CAN1 mutation rate and the C>A transversion bias, then it can be inferred that the diploid parent is also a mutator in the same manner. Further direct analysis of mutation rates and spectrum in the parent diploid CBS 1782 would complete the work. This finding would be quite significant, and would provide strong evidence that wild strains can in fact tolerate the expression of a chronic mutator allele.

      Minor comments:

      A final, relatively minor point. That the new haploids AAR and AEQ show distinct mutation rates and spectra opens the door to an interesting line of inquiry, which may help to identify the causative mutator allele in a manner more efficient than searching for missense mutations. It is stated, and it is understandable, that the identification of the possible causal mutations is beyond the scope of the present manuscript. In this spirit, it would be much more appropriate to restrict such considerations to the Discussion section. Specifically, while the authors make a plausible case for OGG1 being a candidate gene responsible for the C>A mutator phenotype, no experimental demonstration was attempted. As such, that text segment should be moved from the Results to the Discussion section.

      Significance

      As stated in the summary section above, the manuscript by Jiang et al represents a substantial contribution to the fields of genome stability and genome evolution. The method described is likely to be useful beyond budding yeast. The work will be appreciated by a broad audience of geneticists. The additional work and text modifications proposed above would likely further elevate the impact of this work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Answers to the reviewers’ comments

      We deeply appreciate the reviewers for their thoughtful, critical and constructive comments, which have undoubtedly provided us with valuable opportunities to improve our manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Extravasation of lymphocytes from HEV in the lymph nodes is mediated by the interaction between lymphocyte L-selectin and PNAd-carrying sulfated sugars expressed by HEVs. Multiple steps of lymphocyte migration interacting with ECs at the luminal side of HEVs have been studied intensively; however, post-luminal migration steps are unclear. In this study, using intravital confocal microscopy of peripheral lymph nodes (pLNs), the authors found that GlcNAc6ST1 deficiency, required for sulfation of PNAd, delays trans-fibroblastic reticular cell (FRC) migration of lymphocytes, and hot spots of trans-HEV EC migration and trans-FRC migration. Interestingly, hot spots of trans-FRC migration are often associated with dendritic cells (DCs). Thus, the authors concluded that FRCs delicately regulate the transmigration of T and B cells across the HEV wall, which could be mediated by perivascular DCs.

      **Main comments**

      1. This study focused on pLNs, which are quite different from mesenteric lymph nodes (mLNs) in many ways. The authors should include mLNs in their study to make the general statement with regard to the T/B cell entry into lymph nodes. In addition, it will be more significant if this study includes challenged pLNs.

      We thank the reviewer for raising this important point. We agree that mesenteric lymph nodes are quite different from the peripheral lymph nodes that this study focuses on. Therefore, we specified the popliteal or peripheral lymph node in the revised manuscript as follows.

      In the Abstract (page 2), “… Herein, we performed intravital imaging to investigate post-luminal T and B cell migration in popliteal lymph node, consisting of trans-EC migration, crawling in the perivascular channel (a narrow space between ECs and FRCs) and trans-FRC migration. … These results suggest that HEV ECs and FRCs with perivascular DCs delicately regulate T and B cell entry into peripheral lymph nodes.”

      In the Introduction (page 4), “Herein, we clearly visualized the multiple steps of post-luminal T and B cell migration in popliteal lymph node, including trans-EC migration, intra-PVC crawling and trans-FRC migration, using intravital confocal microscopy and fluorescent labelling of ECs and FRCs with different colours.”

      In the Discussion (page 21), “… These results imply that pericyte-like FRCs, the second cellular barrier of HEVs, regulate the entry of T and B cells to maintain peripheral lymph node homeostasis more precisely and restrictively than we previously thought.”

      In addition, we discussed the differences in lymphocyte migration across HEVs between peripheral lymph nodes, mesenteric lymph nodes, and Peyer’s patches in the Discussion of the revised manuscript. We also discussed inflamed lymph nodes in the Discussion as follows.

      In the Discussion (page 20), “… Although this work focused on peripheral lymph node, the other lymphoid organs have different lymphocyte homing efficiency61 due to organ-specific gene expression on HEVs62. B cells home better to mesenteric lymph nodes and peyer’s patches than peripheral lymph nodes61 by CD22-binding glycans expressed preferentially on the HEVs of mesenteric lymph nodes and peyer’s patches62.

      Inflamed peripheral lymph nodes become larger by recruiting more lymphocytes and even L-selectin-negative leukocytes that are excluded in the steady state63,64. Inflamed HEV ECs show different gene expression, such as downregulation of GLYCAM1 and GlcNAc6ST-160. In addition, inflamed HEV integrity may be loosened due to markedly increased leukocyte influx although the HEV FRCs can prevent bleeding by interacting with platelet CLEC-248. CD11c+ DCs are associated with inflamed HEV EC proliferation that is functionally associated with increased leukocyte entry65. The stepwise migration of lymphocytes across inflamed HEVs and their hot spots with perivascular CD11c+ DCs will be an interesting topic for future study.”

      The finding that GlcNAc6ST1 deficiency delays lymphocyte trans-FRC migration but not trans-HEV EC migration is surprising. However, the reason this occurs is neither shown nor discussed. Is GlcNAc6ST1 also expressed in FRCs? Or does GlcNAc6ST1 expression on HEV license lymphocytes to transmigrate across FRCs?

      This is a valid point to be addressed. GlcNAc6ST-1 is predominantly involved in PNAd expression on the abluminal side rather than on the luminal side. Therefore, our result that GlcNAc6ST-1 deficiency increased the time required for trans-FRC migration, but not that for trans-EC migration, could be attributable to a deficiency of L-selectin ligands synthesized by GlcNAc6ST-1 on the abluminal side of HEVs.

      In addition to PNAd expression on the luminal and abluminal sides of endothelial cells in HEVs, PNAd expression has been observed in the reticular network close to HEVs, as shown in the following figures. We believe that PNAds are expressed in FRCs close to HEVs and can affect lymphocyte migration such as trans-FRC migration and parenchymal migration. According to the data (Table S1, Rodda et al., Immunity 2008), GlcNAc6ST-1 (Chst2) is expressed in T-cell-zone reticular cells while GlcNAc6ST-2 (Chst4) is absent. Therefore, it is plausible that FRC-expressed GlcNAc6ST-1 may regulate trans-FRC migration to some extent.

      Figures. PNAd expression on HEVs (arrows) and reticular network (arrowheads) close to the HEVs

      We included these points in the Discussion of the revised manuscript (page 15) as follows.

      “… GlcNAc6ST-1 is predominantly involved in PNAd expression on the abluminal side rather than on the luminal side, although GlcNAc6ST-1 deficiency also modestly affects the luminal migration of lymphocytes by increasing the rolling velocity9. GlcNAc6ST-1 deficiency increased the time required for trans-FRC migration but not that for trans-EC migration. This could be attributable to deficiency of GlcNAc6ST-1-synthesizing L-selectin ligands in the abluminal side of HEV. In addition to the abluminal side of HEV endothelial cells, FRCs also express GlcNAc6ST-1, but not GlcNAc6ST-227, implying that FRC-expressed GlcNAc6ST-1 may regulate trans-FRC migration in some extent. … Thus, PNAds expressed at the endothelial junction and on the abluminal side of HEVs facilitate the efficient transmigration of lymphocytes across the HEV wall but do not slow transmigration in the perivascular region. GlcNAc6ST-1 deficiency and MECA79 antibody also decreased the parenchymal B and T cell velocities immediately after extravasation, respectively, probably because of blockade of parenchymal expression of PNAd in close proximity to HEV6,21,28.”

      Because of the adoptive transfusion experiment, the actual number of transmigrating lymphocytes in Fig. 3F is underestimated.

      We agree with the reviewer’s comment. We corrected the y-axis label in Fig. 3F from ‘average number of cells transmigrating at one site’ to ‘average number of labeled cells transmigrating at one site.’

      Whether DCs covering FRCs have a role for lymphocyte trans-migration is not shown.

      We left this work for future research and discussed the potential mechanisms in the Discussion (pages 17-18), namely that DCs may regulate lymphocyte entry by interacting with FRCs through LTβR or CLEC-2 signaling. We also included ‘Martinez et al Cell Rep 2019 (ref.51)’ in the Discussion of the revised manuscript (page 18). In addition, we discussed better characterization of the CD11c+ DCs in the Discussion of the revised manuscript (page 19) as follows.

      In the Discussion (page 18), “The podoplanin of FRCs also controls FRC contractility49,50 and ECM production51 by interacting with the CLEC-2 of DCs in inflamed lymph nodes. In the steady state, resident DCs in lymph nodes express CLEC-252. Thus, it is conceivable that CLEC-2+ resident DCs may control the contractility of FRCs and remodel ECM surrounding HEVs to facilitate the trans-FRC migration of T and B cells. Thus, the CLEC-2/podoplanin signalling may represent a key molecular mechanism underlying our discovery that trans-FRC migration hot spots preferentially occur at FRCs covered by CD11c+ DCs.”

      In the Discussion (page 19), “… In addition, better characterization of the CD11c+ DCs located in the hot spots of HEVs is required to differentiate them from the other CD11c+ DCs observed in the non-hot-spot regions of HEVs. Some T-cell-zone resident macrophages can also express CD11c54. Imaging of a triple-transgenic mouse with Zbtb46-cre;tdTomato and CD11b-GFP will be able to differentiate 3 types of DCs and macrophages potentially associated with the hot spots: Zbtb46+CD11b- cDC1, Zbtb46+CD11b+ cDC2, and Zbtb46-CD11b+ macrophage54,55.”

      In Fig. 1, time required for trans HEV EC migration and trans-FRC migration of T cells is shorter than that of B cells; however, this finding is not observed in Fig. 2C and E.

      Although the statistical comparisons between T and B cells are not shown in Fig. 2C-F and S5, there are actually significant differences between T and B cells, which are similar to the results in Fig. 1 except for the dwell time in the PVC. P values between T and B cells in wild-type mice are now reported in the figure legends (e.g., 0.0003 for Fig. 2C). In the Results (page 6), “… The mean velocity of T cells (5.3 ± 1.7 μm/min) was significantly higher than that of B cells (4.1 ± 1.4 μm/min) during intra-PVC migration (Fig. 1E), while the dwell time and total path length in the PVC were not significantly different between T and B cells (Fig. 1, H and I). Similar results were obtained when both cells were imaged simultaneously, except that B cells had significant longer dwell time than T cells (Fig. 2C-F and Fig. S5). Interestingly, more than half of the T and B cells crawled from 50 μm to 350 μm inside the PVC (Fig. 1I), …”

      In the legend of Fig. 2, “… P values between T and B cells in wild-type mice were 0.0003 (C), …”

      In the legend of Fig. S5, “… P values between T and B cells in wild-type mice were 0.0240 (A), 0.3614 (B), 0.7518 (C) and 0.1337 (D). …”

      **Minor comments**

      1. Please provide evidence for GlcNAc6ST1 deficiency in HEV and surrounding tissues.

      Previous studies (Uchimura et al., JBC 2004, Nat. Immunol. 2005; ref9 and 10, respectively, in the manuscript) confirmed systemic deficiency of GlcNAc6ST-1 in peripheral lymph nodes of the GlcNAc6ST-1 KO mice.

      Images for delayed trans-FRC migration in GlcNAc6ST1 KO mice relative to WT are not convincing (Fig. 2G and H).

      We think the images look unconvincing probably because it is not easy to quickly identify the frames corresponding to trans-FRC migration in the image sequence. To make the transmigration images easier to recognize, we added arrowheads indicating the transmigration site in Fig. 2G and 2H, and Fig. S4 as follows.

      Provide actual time periods required for Fig. 3F and G. Lack of isotype control IgG experiment in Fig. S3.

      We added the time periods (3 hours) in the figure legend as follows.

      “… (F) Average numbers of labeled T and B cells transmigrating at one site for 3 hours. (G) Ratio of hot spots to total transmigration sites for 3 hours. …”

      The purpose of Fig. S3 was to confirm that the anti-ER-TR7 antibody injection for labeling FRCs does not alter normal T cell motility, rather than to confirm the function of ER-TR7. Therefore, we used a non-injected group as the control rather than a control-antibody-injected group.

      Line 12 on page 11, "the ratio of hot spots to the total “observed” transmigration sites..." is not appropriate. The ratio must be calculated by hot spots to the total "potential" transmigration sites, although it is challenging to find total potential sites.

      We corrected the expression from ‘the total observed transmigration sites’ to ‘the total potential transmigration sites’.

      Please correct typos of angiomoduin to angiomodulin (page 16), ET-TR7 to ER-TR7 (page 17), Anti-CD3 to anti-CD3 (page 22), half the dose to half dose (page 22), the Multiple step to the multiple step (page 23).

      We thank the reviewer for finding those errors. We corrected them and proofread the manuscript repeatedly to correct typos and grammatical errors.

      Please provide an additional explanation of why actin-DsRed in HEVs is more strongly expressed than surrounding tissues such as FRCs in Fig. 1 although actin-DsRed should be expressed in all cell types in mice.

      We were also surprised when we found that HEV ECs expressed red fluorescence more strongly than the surrounding tissues. Although other cells such as FRCs and endogenous lymphocytes also express DsRed under the control of the beta-actin promoter, we believe that HEV ECs express it more strongly, which is sufficient to image only HEV ECs by adjusting the image contrast. We revised the explanation of this point in the Methods (page 21) as follows.

      “HEV ECs of actin-DsRed mouse popliteal lymph node expressed red fluorescence much stronger than the surrounding stromal cells and endogenous lymphocytes, which was sufficient to image only HEV ECs by adjusting an image contrast (Fig. 1, A and B).”

      Reviewer #1 (Significance (Required)):

      The study focused on lymphocytes after extravasation from HEVs, which is an understudied question, using intravital imaging. The in vivo imaging study was deliberately and beautifully performed, and the finding is insightful for understanding lymphocyte trafficking in lymph nodes. However, additional experiments should be performed to address some weaknesses listed in our comments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The present study by K. Choe meticulously monitored the stepwise transmigration behavior of T cells and B cells, respectively, through the high endothelial venules of the mouse popliteal lymph node using the laser scanning confocal microscopy. In particular, the study focused on the post-luminal migration of T and B cells and reported the following. (1) Mice deficient in GlcNAc6ST-1 which is necessary for PNAd expression on the abluminal side of HEV showed significantly reduced abluminal migration of both T and B cells, (2) the footpad injection of the ER-TR7 antibody did not affect T cell transmigration across HEVs but marginally increased the parenchymal T cell velocity when compared with injection of control antibody, (3) T cells and B cells tended to share FRC migration hot spots but this was not the case with trans-EC migration hot spot, (4) the trans-FRC migration was observed at the FRCs closely associated with CD11c+ dendritic cells in HEV.

      While the present study is obviously the product of very meticulous and time-consuming work, it basically describes only a phenomenology, just reporting the lymphocyte behavior within and outside lymph node HEVs, without sufficiently analyzing the mechanistic aspect of the individual event they observed. The only antibody blocking experiments they performed to obtain mechanistic insights was by the use of commercially available monoclonal antibodies, all of which unfortunately contained a preservative, sodium azide, which potently blocks lymphocyte migration in vivo (Freitas AA & Bognacki J, Immunol 36:247, 1979). Therefore, the results of these antibody blocking experiments cannot be taken at face value.

      We thank the reviewer for raising this important point. Freitas et al. pre-treated lymphocytes with sodium azide in vitro for 1 hour, while we injected the antibody into the footpad of the recipient mouse 3 hours before lymphocyte injection via the tail vein and imaging. Sodium azide might be highly diluted under in vivo conditions. In addition, Fig. S3 shows no significant difference in T cell migration in HEVs between anti-ER-TR7 antibody-injected and non-injected groups, although the anti-ER-TR7 antibody also contains sodium azide. We believe that the effect of sodium azide on our results with the PNAd-blocking antibody compared to the control antibody (Fig. S8) may be insignificant. The potential side effect of sodium azide was mentioned in the Methods of the revised manuscript (page 22) as follows.

      “All antibodies we used contain sodium azide that has potential side effects on lymphocyte migration in lymph node57. However, Fig. S3 shows no significant difference in T cell migration in HEV between anti-ER-TR7-injected and non-injected groups.”

      Reviewer #2 (Significance (Required)):

      Real time imaging experiments were performed very carefully. However, as mentioned above, authors used sodium azide-containing antibodies for blocking experiments, and hence, these experiments cannot be interpreted properly.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study presents a detailed investigation of T and B cell entry into lymph nodes (LN) via HEV. Substantial high quality intravital imaging is used to examine trans-EC and trans-FRC migration and define the role of PNAds in this process. The authors find that T and B cells use 'hot spots' to cross EC and FRC barriers, which supports prior similar observations by others. They also show that where T and B cells cross EC and FRC layers can differ, with regions of shared trans-FRC migration but more distinct EC crossing sites. This may relate to differences in the structure of these cellular layers, but provides novel insight into the mechanisms of cell entry into LNs via HEV. Assessment of the dependence on PNAd using antibodies or GlcNAc6ST-1 KO mice revealed perivascular and parenchymal cell behavior is also influenced by these signals. Lastly, examination of DCs that sit on the perivascular FRCs suggested that cells may prefer to cross at sites co-localized by DCs, although the reasons for this are not explored.

      This is a well performed study, with high quality imaging data and analysis. The results are convincing, with sufficient numbers of mice and adequate statistical analysis. There are a number of minor grammatical errors throughout the text, which should be easy to fix.

      We thank the reviewer for the positive evaluation. We proofread the manuscript carefully and repeatedly to correct typos and grammatical errors.

      Reviewer #3 (Significance (Required)):

      Although 'hot spots' have been proposed by others, this detailed analysis provides new knowledge of how lymphocytes can cross the HEV and FRC barriers to enter LNs. This is an important study to advance our understanding of cell recruitment to lymph nodes. The role of perivascular and parenchymal PNAd signals observed here should also be of interest to immunologists to help define the signals required for immune cell motility in tissues.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      The authors have used a combination of intravital confocal imaging and transgenic models to study the migration of T and B cells through the HEVs. They build on the lymphocyte migration studies of Moscacci et al. and Park et al. This study focuses on the visualization and molecular mechanism of post-trans-EC migration, including the intra-PVC and trans-FRC migration of T and B cells in HEVs. They have been able to show how lymphocytes migrate through the HEV into the parenchyma. Using the GlcNAc6ST-1 (catalyst for sulfation of PNAds) KO model (and MECA control for PNAds blocking), they identify the role of L-selectin/PNAd in lymphocyte transmigration. The identification of hot spots of T and B cell transmigration in HEVs is novel and extremely interesting for the field; however, the data shown are not entirely convincing in their current form. The hot spots were defined as areas where the lymphocytes migrate through the HEV epithelial cells and pericyte (FRC) regions. These are areas where migration was largely shared by T and B cells. Using the CD11c-YFP mouse model, they identified CD11c+ cells in proximity to the FRCs located at the migration hotspots, which can drive further speculation regarding the mechanism by which these areas of the HEVs are more permissive.

      **Major comments**

      1) Intravital imaging of T and B cell transmigration across HEVs composed of ECs and FRCs

      • Figure 1: The authors mention that they performed similar experiments for B cells. Authors should show comparative data for T cells and B cells.

      • Panel S1B should be provided for both T and B cells in figure 1.

      We added the image sequence of B cell migration and the panels (Fig S1B of previous manuscript) showing intra-PVC segments of T or B cells in Fig. 1C of the revised manuscript as follows.

      2) T and B cells preferentially share hotspots for trans-FRC migration not EC-migration

      • Figure 4: This data is important to the storyline but as presented it is difficult to understand. Results are overstated in the text however it is difficult to see where these conclusions come from based on the figure. In Figure 4B the authors should show percentages on the Venn diagram or remove it entirely. In Figure 4C the authors should add labels to their y-axis and separate the data in order to assist with the storyline and convince of the presence of hot spots.

      We agree with the reviewer’s opinion. We removed the Venn diagram, separated the Fig. 4C into 4B and 4C, and added y-axis labels in the figures. In addition, we revised the figure legends and the text in the Results to make it easier to understand as follows.

      In the figure legend, “…(B-C) The round and diamond symbols represent predicted and observed values, respectively, for the percentage of T cell hot spots in B cell hot spots (B), for the percentage of B cell hot spots in T cell hot spots (C). …”

      In the Results (page 12), “Simultaneously imaging T and B cells showed that some T and B cells transmigrated across FRCs at the same site (Fig. 4A and Movie S8). To investigate whether T and B cells share their hot spots preferentially or accidentally, we compared the percentage of T cell hot spots in total B cell hot spots (diamond symbols in Fig. 4B) with its predicted value, that is, the probability of accidentally sharing T and B cell hot spots (round symbols in Fig. 4B). The predicted value can be calculated as the percentage of T cell hot spots in total transmigration sites. Of note, the percentage of hot spots in total sites for trans-FRC migration was higher than that for trans-EC migration (Fig. 3G and round symbols in Fig. 4B), possibly because the number of trans-FRC migration sites was smaller than that of trans-EC migration sites. This implies that the probability of accidentally sharing T and B cell hot spots for trans-FRC migration is higher than that for trans-EC migration. However, surprisingly, the percentage of T cell hot spots in B cell hot spots was significantly higher than its predicted value for accidental sharing in trans-FRC migration (Fig. 4B). Similarly, the percentage of B cell hot spots in T cell hot spots was also significantly higher than its predicted value for trans-FRC migration (Fig. 4C). These results imply that T and B cells preferentially share trans-FRC migration hot spots beyond what accidental sharing would predict. However, there were no significant differences between observed and predicted values for trans-EC migration (Fig. 4B and 4C), which implies that T and B cells share their trans-EC migration hot spots only by chance.”
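
      The observed-versus-predicted comparison described in the quoted Results passage can be illustrated with a minimal Python sketch. It assumes that every transmigration site has an identifier and that the T and B cell hot-spot sets have already been determined; the function name `hot_spot_sharing` and the example site IDs are hypothetical and are not taken from the manuscript. The statistical test applied by the authors is not reproduced here.

```python
def hot_spot_sharing(all_sites, t_hot, b_hot):
    """Return (observed %, predicted %) of T cell hot spots among B cell hot spots.

    observed  = share of B cell hot spots that are also T cell hot spots
    predicted = share of T cell hot spots among all transmigration sites,
                i.e. the overlap expected if sharing were accidental
    """
    all_sites, t_hot, b_hot = set(all_sites), set(t_hot), set(b_hot)
    if not all_sites or not b_hot:
        raise ValueError("need at least one site and one B cell hot spot")
    if not (t_hot <= all_sites and b_hot <= all_sites):
        raise ValueError("hot spots must be a subset of all sites")
    observed = 100.0 * len(t_hot & b_hot) / len(b_hot)
    predicted = 100.0 * len(t_hot) / len(all_sites)
    return observed, predicted


# Hypothetical example: 10 sites, 4 T cell hot spots, 4 B cell hot spots, 3 shared
observed, predicted = hot_spot_sharing(range(10), {0, 1, 2, 3}, {1, 2, 3, 4})
print(f"observed {observed:.0f}% vs predicted {predicted:.0f}%")
```

      Swapping the `t_hot` and `b_hot` arguments gives the percentage of B cell hot spots in T cell hot spots, which corresponds to the comparison described for Fig. 4C.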

      3) T and B cells prefer to transmigrate across FRCs covered by perivascular CD11c+ DCs

      • DCs drive changes to FRC phenotype and contractility. The interaction between CLEC-2 (on DCs and platelets) is important for driving permeability of the HEVs. The authors use the CD11c-YFP mouse model in Figure 5 (and the supporting figures) to show the proximity of the CD11c+ cells and FRCs. Data from Baratin et al. (Immunity, 2017) suggest that CD11c+ cells in the parenchyma are also T cell zone macrophages (TZMs) that were previously characterized as DCs. Macrophages have previously been shown to be important for perivascular transmigration of neutrophils during bacterial skin infection (Abtin et al., Nat Immunol 2014). CD11c-YFP alone does not show that the cells proximal to FRCs are DCs, so the authors should try to stain them with CLEC-2 or use the CLEC9a-cre mouse model to better characterise these cells.

      We thank the reviewer for raising this important point. We agree that the perivascular CD11c+ cells could be T-cell-zone macrophages (TZMs). Better characterization of the CD11c+ cells located in the hot spots of HEVs is required to determine whether they are DCs or macrophages, and also to differentiate them from the other CD11c+ cells observed in the non-hot-spot regions of the HEVs. To differentiate DCs from TZMs, a Zbtb46-GFP mouse can be used for imaging because Zbtb46-GFP is highly expressed in conventional DCs (cDCs) but not in monocytes, macrophages, or other lymphoid or myeloid lineages (Satpathy et al, JEM 2012). However, endothelial cells also express Zbtb46-GFP. To visualize only DCs in HEVs, we would need to make a chimeric mouse by adoptive transfer of Zbtb46-GFP bone-marrow cells into an irradiated wild-type mouse. Furthermore, a triple-transgenic mouse with Zbtb46-cre;tdTomato and CD11b-GFP would make it possible to differentiate 3 types of DCs and TZMs potentially associated with the hot spots: Zbtb46+CD11b- cDC1 (red), Zbtb46+CD11b+ cDC2 (yellow), and Zbtb46-CD11b+ macrophage (green). However, since generating or obtaining those transgenic mouse models, including the CLEC9a-cre mouse, will take a long time, we leave this work for future research and have discussed this point in the Discussion of the revised manuscript as follows. In addition, we think that it will be difficult to differentiate the CLEC2 of perivascular DCs from that of platelets by in vivo labeling with an injected anti-CLEC2 antibody conjugated with a fluorescent dye, because the CLEC2 of platelets maintains HEV integrity by interacting with FRC podoplanin (Herzog et al, Nature 2013).

      In the Discussion (page 19), “… In addition, better characterization of the CD11c+ DCs located in the hot spots of HEVs is required to differentiate them from the other CD11c+ DCs observed in the non-hot-spot regions of HEVs. Some T-cell-zone resident macrophages can also express CD11c (ref 54). Imaging of a triple-transgenic mouse with Zbtb46-cre;tdTomato and CD11b-GFP will be able to differentiate 3 types of DCs and macrophages potentially associated with the hot spots: Zbtb46+CD11b- cDC1 (red), Zbtb46+CD11b+ cDC2 (yellow), and Zbtb46-CD11b+ macrophage (green) (refs 54, 55).”

      **Minor comments**

      1) Intravital imaging of T and B cell transmigration across HEVs composed of ECs and FRCs

      • The velocity differences observed could be due to the location of the HEV in the parenchyma. Furthermore, FRC plasticity can cause differences in secretion of chemokine gradients based on the location of cells and their niche (Rodda et al., Immunity 2018). HEV regulation of lymphocyte entry can be influenced by their niche (Veerman et al., Cell Reports 2019). The authors should comment on the HEV position relative to B cell areas.

      We included this point with the references (Rodda et al., Immunity 2018, ref 27; Veerman et al., Cell Rep. 2019, ref 60) in the Discussion of the revised manuscript (pages 19-20) as follows.

      “Compared to T cells, B cells took a longer time to pass the EC and FRC layers in HEVs and had lower velocity in the PVC and parenchyma just after extravasation. Furthermore, the adhesion rate of B cells to HEV ECs on the luminal side is lower than that of T cells (ref 5). These could be attributed to lower expression of L-selectin and CCR7 on B cells than on T cells (refs 18, 59). The difference in homing efficiency between T and B cells may vary depending on the HEV location due to the heterogeneous expression of chemokines and integrins on HEV ECs and surrounding FRCs in peripheral lymph nodes (refs 27, 60). The HEVs imaged in this work were located at a depth of around 40-70 μm from the capsule, which may be close to B cell follicles. B cell homing efficiency in the deeper paracortical T cell zone could differ from our data, probably because of lower levels of CXCL13, a B cell chemoattractant that is highly expressed in follicles. …”

      • The image shown in Fig. 1A is the same as in Fig. S1A/B. I presume this is an error.

      Fig. 1A and Fig. S1A correspond to a 20-μm-thick maximum intensity projection and a single z-frame without projection, respectively. To avoid confusion, we changed Fig. 1A to the single z-frame (Fig. S1A) and removed the 20-μm-thick maximum intensity projection.

      • Figure S3: Data for Ab treated appears to be identical to what is shown for T cells in Fig 1. I presume this is an error and the correct control will be shown.

      We used the data of Fig. 1D-1I as the Ab-injected group in Fig. S3. We are sorry for the lack of clear explanation about this. We included the explanation in the figure legend as follows.

      In the legend of Fig. S3, “(A-E) There is no significant difference between antibody-injected group (Ab) and non-injected group (Non) in T cell migration from trans-EC migration to trans-FRC migration. Non-injected means that no substance is injected into a footpad of mouse. We used the data of Fig. 1D-1I as the antibody-injected group. …”

      2) Non-redundant role of L-selectin/PNAd interactions in post-luminal migration of T and B cells in HEV

      • Could the authors clarify the number of mice used for this analysis (same applies to figure 1)

      The number of mice used is stated in the legends of Fig. 1-2, S6 and S8. In Fig. 1, “Four and 3 mice were used for the analysis of T and B cells, respectively.” In Fig. 2, “Four mice were analysed for each group.” In Fig. S6, “Three mice were analysed for each group.” In Fig. S8, “Five and 4 mice were analysed for the control Ab and MECA79 groups, respectively.”

      In addition, we added the number of mice in the legend of Fig. S7. In Fig. S7, “The images are representative of 4 popliteal lymph nodes of 2 mice and 2 popliteal lymph nodes of a mouse for MECA79 and control IgM antibody, respectively.”

      • Figure S6: further to percentages of T cell populations the authors should also provide the number of T cells (CD4, CD8, CM and naive) for both wildtype and KO.

      We included the analyzed cell number by FACS in Fig. S6 and revised the figure legend as follows.

      In the legend of Fig. S6, “… (B) Analyzed cell numbers by FACS for 3 control and 3 KO mice. (C) Percentage of each type of T cells in DsRed+ T cells. No difference in the percentage of homing central memory, naïve CD4 and CD8 T cells between wild-type and KO mice. …”

      **Methods:** For the flow cytometry analysis, could the details of how samples were processed (or a reference) be provided?

      We added the details in the Methods (page 24) as follows.

      “Popliteal and inguinal lymph nodes were harvested and single-cell suspensions were prepared by mechanical dissociation on a cell strainer (RPMI-1640 with 10% FBS). Cell suspensions were centrifuged at 300 × g for 5 min. Erythrocytes in lymph nodes were lysed with ACK lysis buffer for 5 min at RT. Cell suspensions were washed and filtered through 40-μm filters. Non-specific staining was reduced by using Fc receptor block (anti-CD16/CD32). Cells were incubated for 30 min with varying combinations of the following fluorophore-conjugated monoclonal antibodies: anti-CD3e (clone 145-2C11, BD Pharmingen), anti-CD4 (clone GK1.5, BD Pharmingen), anti-CD8 (clone 53-6.7, eBioscience), anti-CD44 (clone IM7, BioLegend) and anti-CD62L (clone MEL-14, eBioscience) antibodies (diluted at a ratio of 1:200) in FACS buffer (5% bovine serum in PBS). After several washes, cells were analyzed by FACS Canto II (BD Biosciences) and the acquired data were further evaluated by using FlowJo software (Treestar).”

      **References:** The discussion covers key references in the field, but more recent studies should be included. Some examples have been suggested in the comments sections. Key references missing that can help discussion/interpretation of the data include: 1) Veerman et al 2019, Cell Reports. The data in that paper show the heterogeneity of the HEV and different regulation of genes that control lymphocyte entry. This can also be linked to the comments above regarding sections 1 and 2. 2) Rodda et al 2018, Immunity, which focuses on niche-associated heterogeneity of lymph node stromal cells. The authors should also include Webster et al., 2006, JEM, which describes the role of DCs in regulating vascular growth in the lymph node.

      We thank the reviewer for suggesting good references to discuss. We included the references #1 and #2 in the revised manuscript as we responded to the minor comment #1. We also cited Webster et al., JEM 2006 (as ref 65) in the Discussion of the revised manuscript (page 20) as follows.

      “Inflamed peripheral lymph nodes become larger by recruiting more lymphocytes and even L-selectin-negative leukocytes that are excluded in the steady state (refs 63, 64). Inflamed HEV ECs show altered gene expression, such as downregulation of GLYCAM1 and GlcNAc6ST-1 (ref 60). In addition, the integrity of inflamed HEVs may be loosened by the markedly increased leukocyte influx, although the HEV FRCs can prevent bleeding by interacting with platelet CLEC-2 (ref 48). CD11c+ DCs are associated with inflamed HEV EC proliferation, which is functionally associated with increased leukocyte entry (ref 65). The stepwise migration of lymphocytes across inflamed HEVs and their hot spots with perivascular CD11c+ DCs will be an interesting topic for future study.”

      Reviewer #4 (Significance (Required)):

      This paper asks important questions and can make a significant contribution to the field if all revisions are addressed. The authors identified PNAd as an important factor for T cell migration. Further to previous studies in the field suggesting non-random transmigration sites, the authors used intravital confocal imaging to identify how lymphocytes cross the epithelial cells and FRCs of the HEVs to migrate to the parenchyma. The authors identify hotspots used by lymphocytes to transmigrate. Finally, the authors show that CD11c+ cells are proximal to FRC hotspots and might have a role in driving lymphocyte transmigration.

      Audience: Lymphocyte/immune cell biology, stromal immunology, FRC and lymph node inflammation. My expertise: Stromal immunology, immunology, innate immunity

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #4

      Evidence, reproducibility and clarity

      The authors have used a combination of intravital confocal imaging and transgenic models to study the migration of T and B cells through the HEVs. They build on the lymphocyte migration studies of Moscacci et al. and Park et al. This study focuses on the visualization and molecular mechanism of post-trans-EC migration, including the intra-PVC and trans-FRC migration of T and B cells in HEVs. They have been able to show how lymphocytes migrate through the HEV into the parenchyma. Using the GlcNAc6ST-1 (catalyst for sulfation of PNAds) KO model (and MECA control for PNAds blocking), they identify the role of L-selectin/PNAd in lymphocyte transmigration. The identification of hot spots of T and B cell transmigration in HEVs is novel and extremely interesting for the field; however, the data shown are not entirely convincing in their current form. The hot spots were defined as areas where the lymphocytes migrate through the HEV epithelial cells and pericyte (FRC) regions. These are areas where migration was largely shared by T and B cells. Using the CD11c-YFP mouse model, they identified CD11c+ cells in proximity to the FRCs located at the migration hotspots, which can drive further speculation regarding the mechanism by which these areas of the HEVs are more permissive.

      Major comments:

      1) Intravital imaging of T and B cell transmigration across HEVs composed of ECs and FRCs

      • Figure 1: The authors mention that they performed similar experiments for B cells. Authors should show comparative data for T cells and B cells.
      • Panel S1B should be provided for both T and B cells in figure 1.

      2) T and B cells preferentially share hotspots for trans-FRC migration not EC-migration

      • Figure 4: This data is important to the storyline but as presented it is difficult to understand. Results are overstated in the text however it is difficult to see where these conclusions come from based on the figure. In Figure 4B the authors should show percentages on the Venn diagram or remove it entirely. In Figure 4C the authors should add labels to their y-axis and separate the data in order to assist with the storyline and convince of the presence of hot spots.

      3) T and B cells prefer to transmigrate across FRCs covered by perivascular CD11c+ DCs

      • DCs drive changes to FRC phenotype and contractility. The interaction between CLEC-2 (on DCs and platelets) is important for driving permeability of the HEVs. The authors use the CD11c-YFP mouse model in Figure 5 (and the supporting figures) to show the proximity of the CD11c+ cells and FRCs. Data from Baratin et al. (Immunity, 2017) suggest that CD11c+ cells in the parenchyma are also T cell zone macrophages (TZMs) that were previously characterised as DCs. Macrophages have previously been shown to be important for perivascular transmigration of neutrophils during bacterial skin infection (Abtin et al., Nat Immunol 2014). CD11c-YFP alone does not show that the cells proximal to FRCs are DCs, so the authors should try to stain them with CLEC-2 or use the CLEC9a-cre mouse model to better characterise these cells.

      Minor comments:

      1) Intravital imaging of T and B cell transmigration across HEVs composed of ECs and FRCs

      • The velocity differences observed could be due to the location of the HEV in the parenchyma. Furthermore, FRC plasticity can cause differences in secretion of chemokine gradients based on the location of cells and their niche (Rodda et al., Immunity 2018). HEV regulation of lymphocyte entry can be influenced by their niche (Veerman et al., Cell Reports 2019). The authors should comment on the HEV position relative to B cell areas.
      • The image shown in Fig. 1A is the same as in Fig. S1A/B. I presume this is an error.
      • Figure S3: Data for Ab treated appears to be identical to what is shown for T cells in Fig 1. I presume this is an error and the correct control will be shown.

      2) Non-redundant role of L-selectin/PNAd interactions in post-luminal migration of T and B cells in HEV

      • Could the authors clarify the number of mice used for this analysis (same applies to figure 1)
      • Figure S6: further to percentages of T cell populations the authors should also provide the number of T cells (CD4, CD8, CM and naive) for both wildtype and KO.

      Methods:

      For the flow cytometry analysis, could the details of how samples were processed (or a reference) be provided?

      References:

      The discussion covers key references in the field, but more recent studies should be included. Some examples have been suggested in the comments sections. Key references missing that can help discussion/interpretation of the data include: 1) Veerman et al 2019, Cell Reports. The data in that paper show the heterogeneity of the HEV and different regulation of genes that control lymphocyte entry. This can also be linked to the comments above regarding sections 1 and 2. 2) Rodda et al 2018, Immunity, which focuses on niche-associated heterogeneity of lymph node stromal cells. The authors should also include Webster et al., 2006, JEM, which describes the role of DCs in regulating vascular growth in the lymph node.

      Significance

      This paper asks important questions and can make a significant contribution to the field if all revisions are addressed. The authors identified PNAd as an important factor for T cell migration. Further to previous studies in the field suggesting non-random transmigration sites, the authors used intravital confocal imaging to identify how lymphocytes cross the epithelial cells and FRCs of the HEVs to migrate to the parenchyma. The authors identify hotspots used by lymphocytes to transmigrate. Finally, the authors show that CD11c+ cells are proximal to FRC hotspots and might have a role in driving lymphocyte transmigration.

      Audience: Lymphocyte/immune cell biology, stromal immunology, FRC and lymph node inflammation.

      My expertise: Stromal immunology, immunology, innate immunity

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This study presents a detailed investigation of T and B cell entry into lymph nodes (LN) via HEV. Substantial high quality intravital imaging is used to examine trans-EC and trans-FRC migration and define the role of PNAds in this process. The authors find that T and B cells use 'hot spots' to cross EC and FRC barriers, which supports prior similar observations by others. They also show that where T and B cells cross EC and FRC layers can differ, with regions of shared trans-FRC migration but more distinct EC crossing sites. This may relate to differences in the structure of these cellular layers, but provides novel insight into the mechanisms of cell entry into LNs via HEV. Assessment of the dependence on PNAd using antibodies or GlcNAc6ST-1 KO mice revealed perivascular and parenchymal cell behaviour is also influenced by these signals. Lastly, examination of DCs that sit on the perivascular FRCs suggested that cells may prefer to cross at sites co-localised by DCs, although the reasons for this are not explored.

      This is a well performed study, with high quality imaging data and analysis. The results are convincing, with sufficient numbers of mice and adequate statistical analysis. There are a number of minor grammatical errors throughout the text, which should be easy to fix.

      Significance

      Although 'hot spots' have been proposed by others, this detailed analysis provides new knowledge of how lymphocytes can cross the HEV and FRC barriers to enter LNs. This is an important study to advance our understanding of cell recruitment to lymph nodes. The role of perivascular and parenchymal PNAd signals observed here should also be of interest to immunologists to help define the signals required for immune cell motility in tissues.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      The present study by K. Choe meticulously monitored the stepwise transmigration behavior of T cells and B cells, respectively, through the high endothelial venules of the mouse popliteal lymph node using laser scanning confocal microscopy. In particular, the study focused on the post-luminal migration of T and B cells and reported the following: (1) mice deficient in GlcNAc6ST-1, which is necessary for PNAd expression on the abluminal side of HEV, showed significantly reduced abluminal migration of both T and B cells; (2) the footpad injection of the ER-TR7 antibody did not affect T cell transmigration across HEVs but marginally increased the parenchymal T cell velocity when compared with injection of control antibody; (3) T cells and B cells tended to share FRC migration hot spots, but this was not the case with trans-EC migration hot spots; (4) trans-FRC migration was observed at the FRCs closely associated with CD11c+ dendritic cells in HEV.

      While the present study is obviously the product of very meticulous and time-consuming work, it is essentially phenomenological: it reports lymphocyte behavior within and outside lymph node HEVs without sufficiently analyzing the mechanisms behind the individual events observed. The only antibody blocking experiments performed to obtain mechanistic insights used commercially available monoclonal antibodies, all of which unfortunately contained a preservative, sodium azide, which potently blocks lymphocyte migration in vivo (Freitas AA & Bognacki J, Immunol 36:247, 1979). Therefore, the results of these antibody blocking experiments cannot be taken at face value.

      Significance

      Real time imaging experiments were performed very carefully. However, as mentioned above, authors used sodium azide-containing antibodies for blocking experiments, and hence, these experiments cannot be interpreted properly.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Extravasation of lymphocytes from HEV in the lymph nodes is mediated by the interaction between lymphocyte L-selectin and PNAd-carrying sulfated sugars expressed by HEVs. Multiple steps of lymphocyte migration interacting with ECs at the luminal side of HEVs have been studied intensively; however, post-luminal migration steps are unclear. In this study, using intravital confocal microscopy of peripheral lymph nodes (pLNs), the authors found that GlcNAc6ST1 deficiency, required for sulfation of PNAd, delays trans-fibroblastic reticular cell (FRC) migration of lymphocytes, and hot spots of trans-HEV EC migration and trans-FRC migration. Interestingly, hot spots of trans-FRC migration are often associated with dendritic cells (DCs). Thus, the authors concluded that FRCs delicately regulate the transmigration of T and B cells across the HEV wall, which could be mediated by perivascular DCs.

      Main comments:

      1. This study focused on pLNs, which are quite different from mesenteric lymph nodes (mLNs) in many ways. The authors should include mLNs in their study to make the general statement with regard to the T/B cell entry into lymph nodes. In addition, it will be more significant if this study includes challenged pLNs.
      2. The finding that GlcNAc6ST1 deficiency delays lymphocyte trans-FRC migration but not trans-HEV EC migration is surprising. However, the reason this occurs is neither shown nor discussed. Is GlcNAc6ST1 also expressed in FRCs? Or does GlcNAc6ST1 expression on HEV license lymphocytes to transmigrate across FRCs?
      3. Because of the adoptive transfer experiment, the actual number of transmigrating lymphocytes in Fig. 3F is underestimated.
      4. Whether DCs covering FRCs have a role for lymphocyte trans-migration is not shown.
      5. In Fig. 1, the time required for trans-HEV EC migration and trans-FRC migration of T cells is shorter than that of B cells; however, this finding is not observed in Fig. 2C and E.

      Minor comments:

      1. Please provide evidence for GlcNAc6ST1 deficiency in HEV and surrounding tissues.
      2. Images for delayed trans-FRC migration in GlcNAc6ST1 KO mice relative to WT are not convincing (Fig. 2G and H).
      3. Provide the actual time periods required for Fig. 3F and G. An isotype control IgG experiment is lacking in Fig. S3.
      4. Line 12 on page 11, "the ratio of hot spots to the total 'observed' transmigration sites..." is not appropriate. The ratio must be calculated as hot spots to the total "potential" transmigration sites, although it is challenging to find the total potential sites.
      5. Please correct typos of angiomoduin to angiomodulin (page 16), ET-TR7 to ER-TR7 (page 17), Anti-CD3 to anti-CD3 (page 22), half the dose to half dose (page 22), the Multiple step to the multiple step (page 23).
      6. Please provide an additional explanation of why actin-DsRed in HEVs is more strongly expressed than surrounding tissues such as FRCs in Fig. 1 although actin-DsRed should be expressed in all cell types in mice.

      Significance

      The study focused on lymphocytes post-extravasation from HEV, which is an understudied question, using intravital imaging. The in vivo imaging study was deliberately and beautifully performed, and the finding is insightful for understanding lymphocyte trafficking in lymph nodes. However, additional experiments should be performed to address some weaknesses listed in our comments.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Our response to reviewers has been provided as a formatted typeset pdf file. This includes the original review comments (bolded) and our responses. In particular, our responses include several figures. Our intention is to include the full set of reviews and responses as supplementary information in our manuscript once published at a journal. We would also be happy to have this document uploaded to bioRxiv for readers.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Kannan et al. start with the good idea of using Shannon entropy as a way to temporally classify the development of cells, quantifying their maturation status by applying it to single cell gene expression as measured by scRNAseq. The underlying idea is that as cells develop, genes are silenced and hence the overall GeX entropy goes down. This approach would provide a robust method to compare heterogeneous datasets, an important problem because current scRNAseq analysis methods that use dimensionality reduction (such as Monocle) are unable to perform this task robustly. Unfortunately, the analysis and calculation of the entropy, and also the results obtained, do not provide convincing proof that entropy is actually a good metric for comparing development in diverse datasets/cell types.

      Major Comments:

      -The calculation of the entropy is not clear enough (or not performed correctly). Shouldn't Pi be the GeX distribution of gene i across all cells? The authors seem to have calculated Pi as the probability of expression in one cell and then summed across genes. Unless I am wrong, this does not make sense and invalidates all the analysis.

      -Entropy score correlated only moderately with pseudotimes for the three methods. This is a major problem that needs to be explained. One would expect entropy to give a higher correlation if it is a robust measure of development.

      -One of the main purposes of the approach is to classify maturation of in vitro datasets, but basically no entropy changes are found. They are minimal in Fig 5c. In addition, the developmental times of the datasets as shown by color codes do not match the changes in entropy (see Figs 4b, 5a/b).
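
      To make the two readings of Pi raised in the first comment concrete, the following is a minimal sketch, not the authors' implementation. It assumes a raw count matrix with cells as rows and genes as columns; the function names `per_cell_entropy` and `per_gene_entropy` are hypothetical. The first normalizes each cell's counts so that the probabilities sum to 1 within a cell; the second normalizes each gene across cells, which is the alternative the referee describes.

```python
import numpy as np


def per_cell_entropy(counts):
    """Shannon entropy (in bits) of each row of a non-negative count matrix.

    counts has shape (n_cells, n_genes). Within each cell,
    p_i = counts of gene i / total counts of that cell, so sum_i p_i = 1.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("every row needs at least one non-zero count")
    p = counts / totals
    # Use the convention 0 * log2(0) = 0: log is only evaluated where p > 0.
    logp = np.zeros_like(p)
    np.log2(p, out=logp, where=p > 0)
    return -(p * logp).sum(axis=1)


def per_gene_entropy(counts):
    """Alternative reading: normalize each gene across cells instead.

    Returns one entropy value per gene; genes with zero total counts
    must be filtered out beforehand.
    """
    return per_cell_entropy(np.asarray(counts).T)


# Toy matrix (hypothetical data): 3 cells x 4 genes.
toy = np.array([[1, 1, 1, 1],   # expression spread evenly over 4 genes
                [4, 0, 0, 0],   # a single gene expressed
                [2, 2, 0, 0]])  # two genes expressed equally
cell_scores = per_cell_entropy(toy)
gene_scores = per_gene_entropy(toy)
```

      On the toy matrix, the evenly spread cell gets the highest per-cell score and the single-gene cell the lowest, which is the direction of change the manuscript's premise relies on. The per-gene variant returns one value per gene rather than per cell, so it cannot by itself rank individual cells.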

      Minor Comments:

      -Also, Pi being a probability, how was the normalization performed so that the probabilities sum to 1? Given the variability in gene expression, scRNAseq platforms and number of cells, it would be good to have a metric estimating the quality of the distribution.

      -Why is the entropy not compared between the Kannan dataset and Wang and Yao? This would prove that entropy is indeed a good measure as opposed to UMAP+monocle.

      Fig 3 should be in the supplement.

      Significance

      The idea behind this study is of potential significance as well stated by the authors, but the implementation of these ideas lacks scientific rigor. Entropy analysis needs to be repeated or clarified/better explained.

      Referees cross-commenting

      After reading the other reviewers comments showing the relevance of the approach developed by the authors, I do feel that with some clarification/discussion regarding the technical questions of the analysis solving the doubts I expressed, the manuscript could be of interest.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript does a fairly exhaustive job of comparing and benchmarking different single cell/nucleus RNA-seq datasets from in vivo cardiomyocytes and in vitro cardiomyocyte differentiation protocols. The analysis is clearly described.

      Minor comments, questions and clarifications sought:

      It may be useful to emphasize that matching the entropy score of in vivo cardiomyocytes (or a given CM developmental state) is not a sufficient indication of matching the expression patterns of the in vivo counterpart.

      Compare entropy scores from cardiomyocytes from snRNA-seq on post mortem tissue (Litviňuková, et al. Nature volume 588, pages 466-472 (2020)). There are differences in cardiomyocytes obtained from different regions of the human heart (atrial vs. ventricular, left vs. right, etc.). It will be informative to compare the many in vitro differentiation datasets (and protocols) that may result in atrial-like or ventricular-like CMs to their in vivo counterparts.

      This question pertains to in vitro CM differentiation: Is the entropy score sensitive to cell types that differentiate into alternative lineages during in vitro differentiation (issue of purity)? Different cell lineages may have different maturation rates and, if they are not excluded, the non-cardiomyocyte cells could contribute to noisy measurements. If the entropy score is calculated after a first round of clustering, on identified CMs among the population (as opposed to cardiac progenitor cells, for example), I would be more confident of the entropy score.

      This also pertains to in vitro CM differentiation: Even within the cardiomyocyte lineage, there may be different rates of development that ultimately lead to the same end point. Therefore there may be the need to coarse-grain the developmental time-points to account for the precocious ones and the 'late bloomers'. It may be useful to anchor the developmental trajectory based on entropy score to biological milestones (such as when the CM's start beating in plates). Can the authors comment on this, please?

      CM's are interesting in that they are post-mitotic and, as such, will attain a level of maturity at the end of the maturation process. I can imagine this not being the case for cells that continue to cycle and divide. It would be interesting to compare the change in entropy score for such cells. How about cells that differentiate when activated by an external stimulus (e.g., immune cells)? As long as a cell has high transcriptional variability or is transcriptionally active (e.g., as a stress response), it may still show a high entropy score. How would one interpret entropy scores in such situations?

      The authors note "higher mtGENE in differentiated cells and later time points."- Fig 2a. Could this be related to difficulty in dissociation, as part of stress response? The authors note "In particular, 10x Chromium and STRT-seq datasets appeared to have systematically higher percentages of ribosomal protein-coding genes than other protocols." Could this simply be due to higher transcript capture rate of these protocols? These protocols/techniques may not be statistically sampling a cell's transcripts at the same rate as the techniques with "lower" capture efficiency.

      Can entropy score be used in the context of activation (under external stimulus) or deactivation (when the external stimulus is removed)?

      What do the black dots represent in Fig 2c?

      Significance

      The manuscript, "Transcriptomic entropy benchmarks stem cell-derived cardiomyocyte maturation against endogenous tissue at single cell level" by Kannan et al., introduces an interesting phenomenon, transcriptional entropy, to track the rate of maturation in PSC-derived cardiomyocytes. The need for cardiomyocytes in translational and clinical research, along with the difficulty in getting live, mature cardiomyocytes from humans, makes it imperative that in vitro systems are sought. Being able to characterize the rate of differentiation and maturation in these in vitro systems is also valuable and, in that respect, the manuscript does a fairly exhaustive job of comparing and benchmarking different cardiomyocyte differentiation protocols that have been profiled by sc/snRNA-seq to date. Most importantly, comparing entropy scores between in vitro and in vivo counterparts is a simple and elegant way to anchor in vitro differentiation to pre- and post-natal development. Another interesting aspect of the transcriptional entropy measure in a single cell is that it is independent of neighboring cells, and is therefore a conceptually different and novel way to characterize single cell data that, to date, have been analyzed by techniques that group cells by each cell's similarity to others. The study is well conceived and systematically explored. The manuscript is also well written. I recommend that the manuscript be accepted for publication.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Kannan et al. have developed an approach based on the quantification of gene distributions to assess pluripotent stem cell (PSC)-derived cell and tissue maturation. Methodologically, they combined single cell RNA-seq (scRNA-seq) with bioinformatic and statistical approaches to calculate transcriptomic entropy scores to benchmark cellular maturation. Their findings address unresolved issues regarding the developmental state of isolated cells and current problems associated with cell population heterogeneity. As model systems, the authors focused on cardiomyocytes (CMs) from mouse heart and on CMs generated through in vitro differentiation of PSCs from human. The authors examine a spectrum of CMs from mouse heart as a function of developmental time and provide evidence showing that scRNA-seq captures maturation related changes. Using a modification of the Shannon entropy of scRNA-seq and CMs isolated from embryonic, fetal, neonatal and early adult mouse hearts, they show that transcriptomic entropy scores decrease with developmental time. The authors then extend their results to human cells and perform a meta-analysis of publicly available scRNA-seq datasets. When cross-study comparisons were performed, meaningful comparisons could only be generated after gene and cell filtration. The output of the resulting workflow and computed entropy scores show good concordance among cells generated using different in vitro differentiation and different isolation techniques, and between stage-matched mouse and human tissues. The authors go on to show that in vitro derived CMs or reprogrammed CMs (from fibroblasts) undergo an apparent developmental block to maturation in vitro. The relevance of their approach to other cell systems was demonstrated using datasets from pancreatic beta cells and hepatocytes.

      In summary, the calculated entropy scores recapitulate known CM maturation gene expression profiles, making this approach invaluable for future comparisons between engineered and in vivo derived tissues.

      Comments:

      The key conclusions of the manuscript by Kannan et al. are supported by an examination of multiple datasets and the use of extensive and complementary bioinformatic and statistical analyses. The authors utilized a digestion and cell sorting approach that permits the isolation of viable CMs from mouse heart. The choice of scRNA-seq approaches eliminated cell type heterogeneity (either physically or bioinformatically) from otherwise complex cell populations. The authors then employed a variety of analytical approaches to identify limitations to cross-data comparisons and to define the maturation state of the cells. By minimizing protocol-related biases, resolving mismapping of mitochondrial reads to pseudogenes, taking into account variations in study sensitivity, and excluding datasets of relatively poor quality, they were able to develop an informative workflow to generate meaningful entropy scores to benchmark maturation in cross-study and cross-species comparisons. These comparisons were validated using reprogrammed fibroblasts, hepatocytes and pancreatic beta cells. Overall, the experiments were well designed, the experimental and bioinformatic limitations addressed, and the conclusions supported by robust datasets, entropy scores, bioinformatics and statistics. This leads me to conclude that their validated approach will be of significant value to other researchers who need to benchmark cell maturation using a quantitative, transcriptome-based approach.

      A few experimental additions or discussion points would have strengthened the overall impact of this study.

      First, the process of cell dissociation coupled with cell sorting may be associated with a time lag in sample preparation that might be expected to affect RNA stability. If comparisons were performed between scRNA-seq and bulk RNA-seq, would the entropy scores have been equally informative or would differences have been observed from RNA instability that may have affected the entropy scores? While this test would be difficult with in vivo acquired cells, such a comparison could have been made using purified (but not sorted) hPSC-CMs. An answer to this question might be valuable to investigators who wish to use your approach to examine existing bulk RNA-seq datasets. Basically, is the workflow only applicable for scRNA-seq data where problems of cell heterogeneity can be eliminated, even though you provide evidence on how to exclude non-CMs from your datasets using transcriptome profiles?

      Second, would mouse strain differences or sex differences cause a shift in the entropy scores or pseudotime analyses, even if only marginally? Not all mouse models develop at the same rate and sex is known to affect both murine fetal and infant growth.

      Third, when performing the entropy scores and pseudotime analyses, were there specific transcripts or groups of transcripts that were more informative of specific stages of maturation? You mention that ~81.5% were identified as differentially expressed by all methods and some transcript profiles are shown in Figure 4e, but were any informative genes or gene sets (i.e., markers) more useful for assessing maturation that would not require scRNA-seq? This information (which could be added in the supplement) might make your approach more accessible to the broader research community (i.e., the identification of new and informative markers of CM development or differentiation). Alternatively, it may be that scRNA-seq is required. If so, then this should be discussed. Finally, could you comment further on the application of entropy scores to study maturation and how your approach may be of value to the research community? A number of situations beyond comparisons of engineered and in vivo tissues, and somatic cell reprogramming protocols might include an evaluation of PSC-CMs for pharmaceutical and toxicity testing, and the prediction of pathways that may be essential for maturation of cells either through a gene regulatory network or through individual signaling pathways. While these experiments and discussion points are not necessary to support your conclusions, an evaluation of these points and limitations in the Discussion may broaden the paper's impact and significance.

      As minor critiques, there are a few typos (e.g., celltypes [cell types]), redundancies (e.g., ...transcript and protein level expression [...transcript and protein levels.]), and some improvements to the figures that could be made. For the latter, the font sizes are often too small (Figs 1, 3, 4, 5), as are some of the timepoints listed on the x axis (Fig 3a,d, 4b). Otherwise, the figures are visually informative, and the supplemental data are necessary to the assessment of the procedure.

      Significance

      The approach described by Kannan et al. represents a significant advance over existing strategies to benchmark maturation states of PSC derivatives. Gene expression studies (1) and transcriptome-based studies (2-4) have been useful to estimate the developmental state of mouse and human PSC-CMs; however, most published studies have relied either on an assessment of a few markers or on data from a limited number of in vivo derived samples. These earlier studies were further limited by the confounding problem of heterogeneous cell populations. Omics based quantitative approaches have been proposed for improved maturation benchmarking and have proved valuable to study the differentiation of stem cells to progenitors and to committed lineages (5-9). In this paper, Kannan et al. have improved upon these approaches and report the use of entropy scores to benchmark in vitro PSC-CM maturation against a gold standard of in vivo counterparts. The result is a reference resource that captures transcriptomic profiles from mouse CMs across a broad range of developmental states that will be particularly valuable to the cardiac field. By extending the assessments to include meta-analyses and cross-species comparisons (mouse versus human), they have established a workflow that results in a meaningful benchmark of a cell's maturation state. Kannan et al., thus, have developed a quantitative and reproducible approach (entropy score) that simultaneously resolves issues of cell heterogeneity and estimates the in vivo maturation state of in vitro derived cells. This quantitative approach is likely to advance studies designed to assess drug and toxicity testing of more "adult-like" CMs, and adoption of this approach by the broader stem cell community will likely prove invaluable for the assessment of engineered tissues made from complex cell populations and for applications to regenerative medicine.

      Keywords: Reviewer's field of expertise Cardiovascular Physiology, Stem Cell Biology, Omics

      References:

      1. AC Fijnvandraat, et al., Cardiomyocytes derived from embryonic stem cells resemble cardiomyocytes of the embryonic heart tube. Cardiovascular Research 58, 399-409 (2003).
      2. E Poon, et al., Transcriptome-guided functional analyses reveal novel biological properties and regulatory hierarchy of human embryonic stem cell-derived ventricular cardiomyocytes crucial for maturation. PLoS ONE 8, e77784 (2013).
      3. CW van den Berg, et al., Transcriptome of human foetal heart compared with cardiomyocytes from pluripotent stem cells. Development (Cambridge, England) 142, 3231-3238 (2015).
      4. H Uosaki, et al., Transcriptional Landscape of Cardiomyocyte Maturation. Cell Reports 13, 1705-1716 (2015).
      5. D Grun, et al., De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data. Cell Stem Cell 19, 266-277 (2016).
      6. W Chen, AE Teschendorff, Estimating Differentiation Potency of Single Cells Using Single-Cell Entropy (SCENT). Comput. Methods for Single-Cell Data Analysis 1935, 125-139 (2019).
      7. M Guo, EL Bao, M Wagner, JA Whitsett, Y Xu, SLICE: determining cell differentiation and lineage based on single cell entropy. Nucleic Acids Res. 45, 1-14 (2017).
      8. AE Teschendorff, T Enver, Single-cell entropy for accurate estimation of differentiation potency from a cell's transcriptome. Nat. Commun. 8, 1-15 (2017).
      9. GS Gulati, et al., Single-cell transcriptional diversity is a hallmark of developmental potential. Science 367, 405-411 (2020).
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      To whom it may concern:

      We thank the reviewers for their kind assessment of our work and its potential impact. Here we have outlined key points that we plan to address during revisions.

      1. The erect wing story could be investigated a bit further. We agree the erect wing phenotype is intriguing, and will try to improve our understanding. We plan to use fat-body specific c564-Gal4 or BaraA-Gal4 to express UAS-BaraA and attempt to rescue the phenotype. In this way, we will also give insight into whether erect wing can be rescued by immune-tissue or BaraA-endogenous tissue effects. We will note that the cause of erect wing may be due to a lack of BaraA during development and/or during the immune response, which will require careful investigations in the future.
      2. The in vitro antifungal data are modest. We agree. We will perform additional experiments to further corroborate these data to increase confidence in the trends observed.
      3. The nature of the genetic backgrounds is not clear. We will do our best to explain the genetic background complications in the main text. We use w; ∆BaraA flies as an independent means of confirming isogenic data (and vice versa). We had to backcross the ∆BaraA mutation with an arbitrary genetic background prior to experiments to remove an off-site mutation that we detected in the antifungal gene Daisho2 (formerly IM14). As such, there is no appropriate wild-type control for these flies as the background is mixed. We include OR-R as a generic wild-type representative. OR-R flies survive bacterial infection like w; ∆BaraA in multiple assays, and so we feel that different immune competences of the genetic backgrounds are unlikely to explain major susceptibilities to fungal infection. We have additional data for B. bassiana R444 infection (Fig. 4C-D) with a second wild-type that we can include if desired, which shows similar trends when compared to w; ∆BaraA. We will also perform additional experiments with newly-generated isogenic flies to increase confidence in the trends, and to better inform on interactions between BaraA and other immune effectors. For other minor points, we will be happy to make suggested changes to improve clarity of the figures or methodology.

      Best regards,

      Mark Hanson and Bruno Lemaitre

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Hanson et al. have set out to investigate the BaraA gene, and show that the gene encodes for several immune induced molecule (IM) peptides, namely IM10, IM12 (and its sub-peptide IM6), IM13 (and its sub-peptides IM5 and IM8), IM22, and IM24. Flies lacking BaraA are viable but susceptible to specific infections, notably by the entomopathogenic fungus Beauveria bassiana. Furthermore, they show that BaraA is antimicrobial and, when combined with the antifungal Pimaricin, it inhibits fungal growth. In principle, this is a nicely written paper with interesting findings. The authors show induction of BaraA with different micro-organisms and where BaraA is expressed, using a BaraA reporter. The exploration of the genomic area, showing the duplication of the BaraA locus is really nice work. Also, the survival experiments show quite clear phenotypes and therefore effects for BaraA.

      Major comments:

      Line 153, related results: Fold induction of BaraA is greater with E. coli (~50) than with C. albicans (~20) or M. luteus (~6) - any comments on this? Also, infection times with these microbes are different - some comments about BaraA kinetics? Based on Fig 1B, BaraA looks to be highly induced by E. coli, although in Fig 1C, after 60h, reporter induction by E. coli is much less than with M. luteus. Some clarification about the kinetics of BaraA in these different infection models is needed.

      Erect wing phenotype in males: This is a somewhat surprising finding/interpretation. I have also seen erect wings in E. faecalis-infected flies, but I am not sure now in which flies I saw this; I have never tried to quantify this nor made any notes about females/males in this context. I normally use Myd88 RNAi (VDRC #25399) as a control in my experiments, and if they were the ones showing the erect wing phenotype in a prevalent manner, they would also lack BaraA (which is dependent on Toll pathway function). At the time of doing my experiments, I simply interpreted this to mean that the flies looked "sick": they were lifting their wings up and walking around rather than flying. When monitoring my survival experiments, I assumed that the ones with wings up were the ones dying next (the sickest). What is your interpretation; are the flies still ok or very sick when this erect wing phenotype starts to appear?

      Minor comments:

      Wording: In the intro, line 78: "Many of the genes that encode these components of the immune peptidic secretome have remained largely unexplored." - I would say "had remained" until recently - especially the quite recent Bomanin work and work with Daisho1 & 2 have brought about a lot of new information about this "immune peptidic secretome".

      Fig 1A: What is BaraA called in DeGregorio et al? Can't find it (easily) from their lists. Please write BaraA into the Fig 1A graph, to make it clearer. Also, write somewhere in the text or Figure legend what the gene is called in DeGregorio et al (CG33470? CG18278? something else?)

      Line 238 Reference to Supplementary data file 1: In the supplementary data files I downloaded, I can't see the files numbered as data files 1 and so on. Instead, there are folders (Fly stocks, NF-kappaB sites, Primers used) and the files have names. Please clarify that the supplement names match the text.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. State what audience might be interested in and influenced by the reported findings.

      I think the significance of this work is great for Drosophila immunity researchers. The nature and mode of action of many of the Toll pathway-induced peptides are not known, so more information on them is much appreciated by the field. Studying molecules with potential antimicrobial activities is also potentially interesting for a wider audience.

      • Place the work in the context of the existing literature (provide references, where appropriate).

      The main Drosophila immunity pathways are the Toll and the Imd pathways, and when activated, several immune effector genes are induced. In 2015, a group of Toll pathway target genes was identified by mass spectrometry, which the authors here call "the immune peptidic secretome" (Clemmons AW et al., PLoS Pathogens 2015). Many of these peptide genes have been uncharacterized, although emerging studies have shed light on these findings in the past three years (Lindsay SA et al J. Inn. Imm. 2018; Cohen LB et al. Front.Imm. 2020). This research brings about new information on yet uncharacterized peptides in this group.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Drosophila melanogaster, innate immunity, humoral immunity, cellular immunity, Toll pathway, Imd pathway, immune-induced molecules
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Hanson et al. investigated an antifungal gene they named BaraA, which codes for a protein that is proteolytically processed into 8 smaller peptides. BaraA expression is induced by Toll pathway signaling with minor input from the Imd pathway. It is expressed in the fat body upon immune challenge and expressed in other tissues such as head and eyes. Overexpression of BaraA increased the survival of animals defective in both IMD and Toll pathways. In vitro, the combination of the three major BaraA peptides displayed modest inhibitory effect on fungal pathogens when combined with the antifungal drug Pimaricin, however BaraA peptides alone showed little or no antifungal activity. BaraA deficient mutants showed little to no significant difference in bacterial resistance but appeared to show susceptibility to fungal infection; this fungal susceptibility was independent of the Bomanins. Male BaraA mutants also displayed an erected-wings phenotype when subjected to infection.

      There are 3 key findings:

      • BaraA overexpression conferred protection against fungal infection.
      • BaraA-derived peptides displayed antifungal activity in vitro in conjunction with Pimaricin.
      • Loss of BaraA decreased fungal resistance.

      Major Concerns:

      The results from the overexpression experiments were clear. However, the second and third findings were less convincing.

      • The cocktail of IM10-like BaraA peptides showed significant synergy with Pimaricin in killing C. albicans at only one dose out of the five tested, and this combination has modest (19-29%) inhibition on hyphae growth of B. bassiana. The in vitro antifungal experiments might be more compelling if other fungi were examined and/or combinations with other antifungals were investigated, where synergy might be more robust.
      • The most problematic issue with these data is the control of genetic background in the study of the BaraA mutant strains. Much of the survival data compares mutant strains (BaraA and/or Bom∆) with Oregon-R as a wildtype. As best we can tell, the BaraA and Bom strains are not in the same genetic background and neither is particularly similar to OR-R. If the authors can justify the use of OR-R as the wildtype control for these experiments, they should do so explicitly. Otherwise, these experiments are very difficult to interpret. This issue is highlighted by other data, where genetic background is carefully controlled, in the iso-w background, and the survival phenotypes are much milder, and do reach significance in some infections, by log-rank analysis. All experiments should be performed in this controlled background to enable firm conclusions and interpretations.
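
      For readers less familiar with the log-rank analysis mentioned above, a minimal sketch of the standard two-group test follows. It is an illustration under stated assumptions, not the authors' analysis pipeline: the function name `logrank_test`, the input layout (one time and one event flag per fly) and the example values are all hypothetical. The point it illustrates is that the test only compares the two groups it is given, so the choice of control background directly determines what a significant result means.

```python
import math

import numpy as np


def logrank_test(time_a, event_a, time_b, event_b):
    """Two-group log-rank test.

    time_*  : time of death or of censoring for each individual
    event_* : 1 if the death was observed, 0 if the individual was censored
    Returns (chi-square statistic, p-value), 1 degree of freedom.
    """
    times = np.concatenate([time_a, time_b]).astype(float)
    events = np.concatenate([event_a, event_b]).astype(bool)
    in_a = np.concatenate([np.ones(len(time_a), bool),
                           np.zeros(len(time_b), bool)])

    obs_minus_exp = 0.0
    variance = 0.0
    for t in np.unique(times[events]):      # each distinct death time
        at_risk = times >= t
        n = at_risk.sum()                   # at risk, both groups
        n_a = (at_risk & in_a).sum()        # at risk, group A
        died = (times == t) & events
        d = died.sum()                      # deaths at t, both groups
        d_a = (died & in_a).sum()           # deaths at t, group A
        obs_minus_exp += d_a - d * n_a / n  # observed minus expected in A
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("no informative event times")
    stat = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical survival times in days; every death observed.
mutant = [1, 2, 3, 4, 5]
control = [10, 11, 12, 13, 14]
stat, p_value = logrank_test(mutant, [1] * 5, control, [1] * 5)
```

      Swapping `control` for a strain from a different background changes the statistic without any change in the mutant data, which is why the comparison is only interpretable when the two groups differ at the locus of interest alone.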

      Minor comments:

      • Figure 1A mined data from a previous published study, which is acceptable, but this data presentation lacks proper description of the methodology, reproducibility, and statistics.
      • The authors need to clarify the condition of the flies in Figures 1D to G (as well as S1C and D). Infected? Baseline? It is not clear.
      • There is no visualization of the genomic location of the BaraA deletion, which should be added to figure 2C.
      • The authors should include the full genotype information for the Bloomington stocks, since the BL numbers may change over time.
      • In Figure 2C, the authors should include some information about which lines possess the single BaraA locus and which lines have the duplication event.
      • The authors should elaborate on what is known about Dso2 and how the aberrant Dso2 locus might affect their assays. The information here is incomplete and confusing.
      • Does the Ecc15 strain used in the paper innately resist Ampicillin? If yes, then the result of Ecc15 resisting the combination of IM cocktail and Ampicillin does not reveal much.
      • It is unclear what concentration of pimaricin was used for Figure 3E.
      • The authors should include a clear genetic explanation for their conclusion that BaraA and Bomanins function independently. The text describing this double mutant analysis could be more informative.
      • BaraA overexpression significantly improved female survival against M. luteus (Figure S4C, p=0.006), this is interesting but not mentioned in the text.
      • The authors should be clear and consistent about the pathogen source (lab grown vs. commercial) and method of infection (natural infection vs. septic injury). The authors should explain the difference in virulence between different infection models and methods.
      • The sex-specific erected wings phenotype is interesting, but does not contribute to the overall significance of the manuscript. The authors should consider moving Figure 6 to the supplement.

      Significance

      This work is a potential step in characterizing the immune effectors downstream of the Toll pathway that contribute to the Drosophila defense against fungal pathogens. These effectors have so far not been characterized or understood. We are familiar with the Toll pathway and its effectors, but are in no way experts.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors use the fruit fly Drosophila melanogaster as a model to study innate immunity. In this manuscript, they study the effects of a set of antimicrobial peptides (AMPs) that are produced by furin cleavage of a larger precursor (Baramicin A, BaraA). BaraA is immune-induced in a Toll-dependent manner and has antifungal activity. Somewhat in line with expression in non-immune tissues, BaraA mutants show an erect-wing phenotype in males.

      Major comments:

      The experiments are well presented in a reproducible and statistically sound way. In particular, care is taken to control for effects of the genetic background. The immune phenotype of BaraA mutants is somewhat subtle but convincing, and in line with recent findings by the same authors that some of the recently created CRISPR/Cas mutants in antimicrobial peptides have broader effects while others target intruders in a more specific manner or in combination with other AMPs. These are very relevant studies, which provide a balanced view of innate immunity, in particular AMP action. I have one comment about the (BaraA-dependent, male-specific) erect wing phenotype: this is an interesting observation that could stimulate work by others, which I guess is one reason why it was included in the manuscript. On its own it stands out a bit, since in contrast to other parts of the manuscript, no insight into the underlying mechanisms is provided for the erect wing phenotype. The authors speculate that the non-immune expression may be responsible. One might use tissue-specific knockdown or rescue to check this (wing muscle or nervous system). This would be cost effective but would delay publication for a few months. Whether the phenotype is considered part of BaraA pleiotropism (which I could buy) or is considered too descriptive and should be reserved for a later publication depends on the respective journal policy and on the plans for further investment by the groups involved. Along similar lines, while sex-specific immune phenotypes are highly interesting, they open up many discussions about the underlying causes, both proximal and ultimate.

      Minor comments:

      The experiments look sound and previous work is mentioned sufficiently. The experimental design and results are easy to follow. I have mentioned some concerns about the erect wing phenotype (see above). Is there any evidence for metabolic regulation of BaraA (TF binding sites, for example), in particular in the promoter fragment used for the reporter line? Did any of the fat body drivers show the same effect as the ubiquitous actin driver (this would increase specificity)?

      Why was pimaricin used? It appears to serve as a representative of membrane-active antifungal drugs, which BaraA peptides are likely not. Still, using combinations with other insect (Drosophila) antifungal AMPs would be more physiological. Maybe this was tried and did not work, but it should still be discussed. Or do the authors want to imply that physiologically the Daisho peptides or Bomanins have this effect? Perhaps elaborate on this.

      In Fig. 1: part H is missing although mentioned in the legend.

      In the abstract:

      it should be more clearly mentioned that the erect wing phenotype was observed in the mutants.

      lines 27 and 28: replace one "characterized"

      line 28: contribute

      line 33: entomopathogenic

      other places:

      line 68: AMPs

      Significance

      The evolutionary relevance and therapeutic potential of AMP synergism is an emerging topic within insect immunity, innate immunity in general, and patient treatment [1, 2]. The latter aspect may be interesting to justify the use of pimaricin. Thus, the work presented here, in combination with previous work from the authors, leads to a more balanced view of the action of insect AMPs (the authors call that the logic of the Drosophila effector response), with implications for human innate immunity and perhaps even therapy of diseases. Therefore, the data will be interesting for a broad audience. The use of models such as Drosophila, which can be manipulated in a targeted manner, has provided insights that are beyond the study of single AMPs in vitro. Still, results from overexpression, as used in some cases here, should be interpreted cautiously and, if available, compared with data on in vivo concentrations of AMPs (the authors try to derive estimates from the MS data), which may be difficult if there are large local differences.

      My own field:

      primarily insect immunity with a background in mammalian immunity, although I am not able to keep up with all recent developments in mammalian immunity.